ABSTRACT



Project URBANDOC reports on four years of activity 
as an Urban Renewal Demonstration Project at the City University of 
New York. The Project aims toward improvement of bibliographic 
services in urban affairs. URBANDOC is one of the first of the 
library-information sciences systems to deal specifically with the 
social sciences. The final report consists of three volumes: the
"Demonstration Report," the "General Manual" (Technical Supplement 1)
(see LI 002 881), and the "Operations Manual" (Technical Supplement
2) (see LI 002 882). The "Demonstration Report" provides an over-all
view of the objectives, features, accomplishments, and conclusions 
and recommendations of the Project. Appendices contain the Prototype 
Retrieval Reports, and the Prototype Input Index. (Author/HM) 
PREFACE



Project URBANDOC is reporting on four years of activity as an Urban Renewal
Demonstration Project at The City University of New York. The project evolved from a
need for improving bibliographic services in urban affairs, and specifically urban renewal,
at a time when computer technology was being incorporated into a wide range of
information systems. URBANDOC was one of the first of the library-information science
systems to deal specifically with the social sciences.

The final report consists of three volumes: the Demonstration Report, the General 
Manual (Technical Supplement 1), and the Operations Manual (Technical Supplement 2). 
Each of these is bound separately and intended for separate distribution. For the most 
general reader who wishes an over-all view of the objectives, features, accomplishments,
and conclusions and recommendations of the project, the Demonstration Report should
suffice. 

The General Manual is designed to provide the reader with detailed knowledge of the
techniques developed for handling the documents according to library-information
science practices as developed by Project URBANDOC. While it also provides an overview
of the programming system used by the project, the Operations Manual should be 
consulted for detailed systems analysis, programming, and operations data. 

The U.S. Department of Housing and Urban Development has been most generous in its 
assistance of Project URBANDOC, from project submission to final report. HUD's 
commitment to the Demonstration was as important conceptually as it was
economically, and the University's indebtedness is thus two-fold. The President and
Deans of the University Graduate Division join the New York City Planning Commission 
and the URBANDOC staff in thanking the Department for having made possible each 
of these three final volumes, as well as the entire project. 



INTRODUCTORY SUMMARY 



The objectives, essential features, accomplishments and conclusions summarized below 
resulted from a demonstration effort that was carried out in the format of a small
semi-independent documentation facility located in an academic environment. It should be
pointed out that the project would have had essentially the same characteristics if it 
had been set in the context of a library or an operating agency. 

Objectives 

Project URBANDOC was concerned with a particular kind of urban information problem:
that posed by the ever-increasing literature on urban affairs. The purpose of the project
was to devise and test an automated information system based on techniques already
under development in other areas, techniques that employ computer technology to gain
better access to technical reports and articles than has hitherto been available.
URBANDOC was, therefore, working with the kind of system that uses bibliographic
records as its data base, and that services its users by providing them with citations to 
books, journals, and official documents that seem relevant to their specialized 
information needs. 

The over-all objectives were to be realized through the demonstration of five separate end 
products. The first two were designed specifically for practitioners in urban development:

Urban Objectives 

(1) A machine-readable file of bibliographic records containing extensive subject analysis, 
together with the computer programs to search those records in answer to specific subject
needs; and (2) A group of publications programs which can use the bibliographic file to 
produce more generalized subject and other indexes for general distribution. 

The first of these end products is exemplified by the prototype Retrieval Report 
(Appendix A); the second is exemplified by the Input Index (Appendix B). Neither of 
them was to be put into operation by URBANDOC, only tested as to feasibility.
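
The two urban end products can be pictured with a minimal sketch in present-day Python. It is an illustration only and does not reproduce the URBANDOC programs, which ran on IBM 1401 equipment. The record layout, the sample records, and the names `BibRecord`, `search`, and `build_subject_index` are all assumptions made for this example. The first function stands for a subject search of the file, the second for the inversion of the file that a printed subject index requires.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BibRecord:
    """A hypothetical bibliographic record with manually assigned subject terms."""
    doc_id: int
    title: str
    descriptors: set[str] = field(default_factory=set)


def search(records, required_terms):
    """Return the records indexed under every one of the required terms (AND search)."""
    wanted = set(required_terms)
    return [r for r in records if wanted <= r.descriptors]


def build_subject_index(records):
    """Invert the file: map each subject term to the sorted ids of its documents."""
    index = defaultdict(list)
    for r in records:
        for term in r.descriptors:
            index[term].append(r.doc_id)
    # Terms are emitted in alphabetical order, as a printed index would list them.
    return {term: sorted(ids) for term, ids in sorted(index.items())}


# Invented sample records, for illustration only.
RECORDS = [
    BibRecord(1, "Relocation in Renewal Areas", {"RELOCATION", "URBAN RENEWAL"}),
    BibRecord(2, "Models in Transportation Planning", {"MODELS", "TRANSPORTATION"}),
    BibRecord(3, "Housing and Renewal Finance", {"HOUSING", "URBAN RENEWAL"}),
]

hits = search(RECORDS, ["URBAN RENEWAL", "HOUSING"])
subject_index = build_subject_index(RECORDS)
```

In this sketch the same file serves both purposes: `search` answers one specific subject request, while `build_subject_index` produces the term-to-document listing from which a general index could be printed.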

Library-Information Science Objectives 

The other three end products were: 

(3) A thesaurus of the indexing terminology employed in constructing the analytic 
portion of the bibliographic file; (4) A segment of the entire project that might be 
suitable for local use; and (5) A set of project manuals that would describe the entire 
system in sufficient detail to enable potential users to adapt URBANDOC to their own
needs. 

These three products — all contained in the two manuals — are intended for the library 
and information staffs rather than the urban technicians. 

Essential Features



As the objectives indicate, URBANDOC's contribution to the urban scene was to locate 
sources of information in published documents, not to launch a broadside attack on the 
entire spectrum of information science. Its essential features were, therefore, those of a 
discipline now being called Library-Information Science at The City University of New 
York. 

Project Limitations 

Even within the boundaries of Library-Information Science, URBANDOC was a limited
endeavor. It was intended to explore rather than exploit all the potentialities of a
bibliographic information system for urban affairs. Its document base was "typical"
rather than comprehensive; its output consisted of prototypes rather than initial issues of
ongoing services. Moreover, the total budget of half a million dollars left little room for
testing many alternative courses of action. The essential design of the original submission
prevailed over most of the implementation activities. 

The final data base consisted of six thousand bibliographic records, far larger than most 
test bases, but far less than would be expected of an operational information system. The
same is true of the staff, which consisted of ten members at its height, with never more
than two full-time positions devoted to systems analysis and programming. 

Systems Limitations 

In terms of project design, the essential feature was the development of a programming 
system that utilized as many existing systems as could be found and adapted to the needs 
of URBANDOC. It started — and finished — as an integrated group of programs that 
could all be executed on IBM 1401 computing equipment. They depended upon an input 
that consisted of bibliographic records created by manual indexing, or document analysis. 

Although URBANDOC was essentially a product of second-generation computing 
languages and equipment, the staff did give considerable thought to both the advantages 
and the problems of the emerging third-generation environment. 

Activity Span

URBANDOC's work was divided between purely developmental activities and those that
simulated production. The production part started with the initiation of contacts with
document-producing agencies and publishers. The documents were then acquired, and six
thousand of them selected for inclusion in the demonstration system. From then on, they
were subject to all the analytical and data entry procedures that resulted from the 
developmental side of the project. 

At the other end of the production-type work flow, there were both internal uses of the 
system to test publications and retrieval capability, and field-test issues of the Input
Index and Retrieval Reports to secure feedback from representatives of the potential
client community. The output side included all the computer processing necessary to 
produce the bibliographic products, and all the supporting services necessary for their 
reproduction. 




Field Testing 

As a demonstration project, URBANDOC felt it necessary to expose its products to public
scrutiny in time to secure and utilize feedback from potential users. The concept of
field-test publications was therefore devised, and a great deal of time devoted to its
implementation. First the Thesaurus, then the Input Index, and finally the Retrieval
Report were distributed to a variety of urban professionals. 

The Input Index was tested most extensively: four editions were produced and evaluated
before URBANDOC settled on the present one as the most satisfactory within the various 
project limitations. About five thousand individuals were involved in the entire series of 
field tests as recipients of field-test products, some of them several times. 

Accomplishments 

The accomplishments of Project URBANDOC are manifested in two formal ways: in the 
programming system that was submitted to HUD in the form of tapes, punched cards,
and printout listings, and in the written report. 

The System 

URBANDOC used as the core of its system a set of computer programs from the IBM 
Program Library: the Combined File Search System (CFS). It fulfilled the majority of 
URBANDOC's requirements in the areas of thesaurus, file maintenance, and search, as
well as allowing for a gradual expansion to a total systems approach. This set of programs 
was designed for use on the IBM 1401 computer. CFS — with modifications by 
URBANDOC and other users — provided the project with its retrieval capability. 

The publications capability was URBANDOC-designed and -programmed. Originally in
AUTOCODER, the programs were converted to COBOL in order to make them
machine-independent. To them was added another group of programs to perform various
editing functions; these also are machine-independent. Together, the two groups, or
modules, constitute the local subsystem.

The Documentation 

The documentation is more detailed than usual. Although the original project documents 
made little reference to this responsibility, the systems staff that came into HUD during
the project furnished a great amount of guidance. The documentation includes program 
inventories, abstracts, input and tape specifications, data entry procedures, descriptions 
of processing cycles, operating instructions, error listings and systems messages, tape 
library and report controls, timing and local implementation. 

The Products 

The final consumer products are illustrated by the Input Index and the Retrieval Report.
In general appearance as well as in function they are in line with similar
computer-produced bibliographic services in the fields of medicine, space, chemistry, and
education. Had computer-assisted typesetting been available, the similarities would have
been even greater.

The Thesaurus is contained in the appendix to the General Manual. Even during the final
days of report writing, the project was still receiving requests for the Thesaurus that had
been generated by the appearance of the field-test edition in 1967. Although the final
version will be over a year old by the time it is ready for distribution, inquirers will be
notified as to its availability.

The local subsystem is harder to visualize except in terms of varying the Input Index to 
meet local needs. A simplified version of the retrieval capability would have been 
possible, but would have caused too many problems.

The project manuals, on the other hand, can not only be visualized but read in their 
entirety. The General Manual includes all the directives necessary for creating the 
bibliographic records that support the system. The chapters on descriptive analysis, 
content analysis, and search are extensive. Those on the computer components are 
designed basically for the non-systems readers. The systems people have an entire 
Operations Manual.

Operational Data and Cost Analysis 

One important result of URBANDOC's production-type activities was that the project 
could furnish meaningful data for operational and cost-analysis purposes. The results — 
manifested in Chapter VII of this report — contribute to the literature of 
library-information science figures that are not easily available elsewhere. They are based 
on a five-month simulation-of-production period. 

The idea of product/process schedules — proposed by HUD — enabled the project to 
study the inputs and outputs of each individual activity necessary to carry on an
operational bibliographic information system. The resulting item counts, manning data,
and machine times provided quantitative measures and physical descriptions of
information processes that are sometimes difficult to conceptualize in terms of the real
world. 



Another benefit of the operational data analysis was that it led directly to the cost
analysis. Personnel and machine costs were studied in detail, with separate costs assigned
to each of the product/process schedules. One of the most significant findings was the
direct unit cost associated with preparing one item of input in an environment that could
manage an annual input of ten thousand documents. This was $11.54 for
personnel, and $1.56 for equipment.
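
The arithmetic implied by these two figures can be worked through in a short Python sketch. Only the two unit costs and the ten-thousand-document volume come from the report. The combined unit cost, the annual total, and the personnel share are derived here for illustration; they cover direct input costs only and are not figures quoted by the project.

```python
from decimal import Decimal

# Figures stated in the report.
PERSONNEL_COST_PER_ITEM = Decimal("11.54")  # dollars per input item
EQUIPMENT_COST_PER_ITEM = Decimal("1.56")   # dollars per input item
ANNUAL_INPUT = 10_000                       # documents per year

# Derived values (an inference from the figures above, not quoted by the report).
unit_cost = PERSONNEL_COST_PER_ITEM + EQUIPMENT_COST_PER_ITEM
annual_direct_cost = unit_cost * ANNUAL_INPUT
personnel_share = PERSONNEL_COST_PER_ITEM / unit_cost

print(f"Direct unit cost per input item: ${unit_cost}")
print(f"Direct input cost at {ANNUAL_INPUT} items a year: ${annual_direct_cost}")
print(f"Personnel share of unit cost: {personnel_share:.0%}")
```

On these assumptions the direct cost of preparing one item is $13.10, of which personnel accounts for roughly 88 per cent, and a full year of input at the stated volume would cost about $131,000 before any processing, publication, or overhead costs.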

Conclusions and Recommendations 

On Limited Projects 

Although satisfied that it completed its mission successfully, URBANDOC would not
recommend that a second-generation system be used in the future for other than training
purposes. There are still lessons about library-information science to be learned apart
from the data-processing components. However, the effort necessary to make a system
work is so great that any investment in another "limited" project should be carefully
considered.

On Operational Systems 

That operational systems can be constructed for index-type publications has been
successfully demonstrated. The retrieval capability of information systems has still not
been completely proven to be economically feasible. However, the URBANDOC
experience indicates that certain kinds of retrievals are more feasible than others, and that
further work is certainly warranted in view of the great need to control urban
information.

On Economic Feasibility 

Although it has not been determined by professional market survey, there does appear to 
be a real demand for a documentation effort in urban affairs. Whether this includes an
effective demand (i.e., a willingness to pay) for the proposed products and services
remains to be determined. It also appears that any documentation effort, whether based
on both a publications and a retrieval capability or solely on a publications effort,
cannot yet be self-supporting, and that the
additional funds required to support such an operation must come from the federal
government. (The factors leading to these conclusions and recommendations are discussed
in greater detail in Chapter VIII.)
PROJECT BACKGROUND
Originating Environment 

The name "URBANDOC" now identifies an experiment in the application of computer
technology to the documentation of the literature on urban renewal. As an Urban 
Renewal Demonstration Project, URBANDOC started its work on July 1, 1965, with the 
aid of a grant to The City University of New York from the Housing and Home Finance 
Agency - Urban Renewal Administration. This became the Department of Housing and 
Urban Development (HUD) in 1965-66. 

As a concept, URBANDOC dates back even earlier, to a pilot study with the same name 
that reported to the 1964 Annual Conference of the American Institute of Planners.[1] The
Taconic Foundation of New York had made it possible for a task force to be assembled 
for the express purpose of demonstrating prototype information services geared to the 
needs of urban specialists. The impetus came from a merger of the information interests — 
data and library — of the New York City Department of City Planning; much of the work 
was done in what is now called the Interdepartmental Housing and Planning Library, with 
the then librarian becoming the project director. 

The 1964 effort had created a machine-readable bibliographic file that represented two 
hundred relevant documents, and utilized a group of computer programs from the IBM 
Program Library that were similar to programs being used as the basis for documentation 
efforts in the "hard" sciences. From the various URBANDOC experiments came results
indicating that further development was warranted. 

The project developed in an environment that included (1) documentation and 
information science, (2) urban information systems, and (3) planning librarianship. 

Documentation and Information Science 

Project URBANDOC was most closely related — in concept as well as in execution — to 
those developments in information systems known variously as library automation, 
documentation, and information science. All these systems share the attribute of dealing 
with bibliographic representations of published materials, whether the goal is to speed up 
existing operations (circulation control, accounting, and catalogue card production) or to 
store and retrieve references to satisfy highly individualized information requirements 
(the use of models in transportation planning). 

URBANDOC's interest was less in the purely automative functions, and more in the 
bibliographic service area, such as information storage and retrieval and the production of
multiple-access indexes. Both of these already had operating precedents in such fields as 
medicine, space technology, and engineering. These precedents — in theory as well as in 
application — were to be adapted, if possible, to the requirements of the urban literature. 

[1] Vivian S. Sessions, URBANDOC: A Report on Computerized Documentation and Information
Retrieval in the Literature of Urban Planning and Renewal (New York: Institute of Public
Administration, 1964), 24 pp. See also Howard Sentley and Richard May, Jr., "URBANDOC, a
Cooperative Project of Librarians and Planners," in Special Libraries (April 1966), pp. 244-246.



Since URBANDOC did not start with fundamentally new responsibilities in information
science, it is pertinent to inquire as to the state of the art on which it so depended.
During the very period (May-September 1965) when URBANDOC was firming up as a
federally-aided project in the hitherto untouched social sciences, the System
Development Corporation was studying documentation in science and technology for the
Committee on Scientific and Technical Information (COSATI). The report that resulted is
an excellent guide to the plans and accomplishments of the time.[2]

In 1965 there were fewer projects than now, certainly fewer that had reached operational
status. However, then and throughout the demonstration period, URBANDOC devoted 
considerable effort to keeping in touch with relevant developments. Local and national 
meetings of the American Society for Information Science and the Special Libraries 
Association provided a great many opportunities for seeking the benefit of outside 
experience. The literature of documentation, the project's many personal contacts, and 
various specialized meetings also helped to keep open the avenues of communication. 



Urban Information Systems 

Although the intellectual and technological basis of URBANDOC was in documentation 
and information science, to the urban professionals it was a specific kind of urban 
information system. The project encouraged the viewpoint that references to specific 
documents were no less important, in the over-all information universe, than statistics and 
other discrete data items, especially if the data items have to be located via bibliographic 
systems. 

Much of the original impetus for the demonstration came from contact with those 
members of the planning profession who were investigating the applicability of data
processing to their own problems. Both the American Institute of Planners (AIP) and the
American Society of Planning Officials (ASPO) scheduled discussions on data processing
at their annual conferences from 1960 on, with models, data banks, and computer
graphics receiving their share of attention. By the time that URBANDOC was officially
underway in 1965, the interest in computer technology had spread to other important
segments of the urban development community. The Urban and Regional Information
Systems Association (founded in 1966) became one focus of the spreading interest; 
another was the Annual Symposium on the Application of Computers to Urban 
Problems, initiated by the New York section of the Association for Computing Machinery 
a year later. 

The interaction between URBANDOC and the urban information community had 
considerable effect on the project. Some of the earlier associates became members of the 
National Advisory Council for the demonstration, as elaborated later in this chapter. 
After that, the greatest single effect was, perhaps, the sharpening of the project's 
awareness of the need for sophisticated geographic access to the literature. 



[2] Launor F. Carter and others, National Document-Handling Systems for Science and
Technology (New York: Wiley, 1967), 344 pp., including selected bibliography.
Planning Librarianship



The awareness of urban information systems, and the bridge between those applications 
of computer technology and documentation, was provided by planning librarianship. 
Although that library specialty is not nearly as well known as the older specialties dealing 
with the medical and legal literature, it too has a national organization, the Council of 
Planning Librarians (CPL), and even independent representation on the Council of 
National Library Associations. Organized in 1959, CPL has grown steadily in
membership, size of book collections, and numbers of users serviced. The largest portion of its
members serve either governmental agencies concerned with physical planning and urban 
development or university library collections related to planning and architecture. 

Related by subject interest is the Planning, Building and Housing Section of the Special 
Libraries Association. Established as a Section of the Social Science Division in 1961, its 
members serve a widely diversified group of institutions, ranging from advertising 
agencies, trade associations and publishing companies to institutions similar to those 
served by the council. Many section members are also council members. 

On several occasions in the early 1960's, the Council of Planning Librarians attempted to 
solve a serious problem in that branch of the profession: the limited number and 
usefulness of existing bibliographic tools. There were frequent discussions at its annual 
meetings of an indexing-abstracting journal, an idea that received favorable comment 
throughout the planning community. However, attempts to interest funding agencies had
not been successful until the time that URBANDOC announced an index publication as
one of its goals. 

Also under discussion by CPL during that period was the development of a 
subject-heading list. The Thesaurus to be compiled by URBANDOC was a response both 
to that need and to the internal needs of the project. In order to fulfill its retrieval
responsibilities, URBANDOC had to devise an indexing terminology appropriate for a
data base designed for computer searching. Its second responsibility in this area was for
terms that could be used as the basis for a subject approach in the projected index
journal. As the project progressed, there was a growing emphasis on terms with 
"stand-alone" value, and the URBANDOC Thesaurus became looked upon by many 
librarians as a guide to the development of their own specialized subject terminology, one 
that could be used in conjunction with the 1962 Subject Heading List of the HHFA 
Library. 

Throughout its duration, URBANDOC maintained close association with both the 
Council of Planning Librarians and the Planning, Building, and Housing Section. From 
their memberships came very useful and perceptive critics of Project URBANDOC. 



Demonstration Status 

The pilot operations of URBANDOC confirmed the general impression that considerable 
work would be necessary to the development of a computerized information storage and 
retrieval system that could meet the needs of both practitioners and librarians in the 
urban field. The Urban Renewal Demonstration Grant Program in what is now HUD
appeared to be a logical source of funds, and the necessary contacts were initiated.

Local Sponsorship 

To meet requirements for status as a demonstration project, more formal institutional 
relationships were required than had been necessary under the Taconic grant. By this
time, The City University of New York (CUNY) had become interested in the project,
and the university decided to provide the official local sponsorship with all of the 
accompanying responsibilities of that status. The New York City Planning Commission 
offered to be joint sponsor. 

In May 1965, The City University applied to HUD for a grant; in June 1965, Urban
Renewal Demonstration New York D-9 was approved. In 1968, the original grant was
augmented to provide for an additional year. 

Throughout this period, the administration of URBANDOC in terms of fiscal and
personnel management was handled by the Research Foundation of The City University
of New York under the terms of a third-party contract between the foundation and the
university, and approved by HUD. Both the foundation and the project were located at
the Graduate Center. The president of the University Graduate Division was in effect the
principal university official in all policy matters affecting URBANDOC. Formal
relationships with other parts of the university, such as Baruch College, were handled through
her office. 

Objectives 

The specific objective of the URBANDOC demonstration was to develop and test an 
automated information system involving the storage and retrieval of bibliographic 
references to those published materials which are the tools of the trade for urban 
planning and renewal practitioners. Computer technology was to be an integral
component in the methodology, but the specific goals were to be five products of a 
computerized system rather than the system itself. 

The project was approved to do the following: 

1. Construct a thesaurus of the indexing terminology; 

2. Produce bibliographic references as a result of computer searches of a bibliographic 
data base; 

3. Generate hard-copy indexes by computer publication programs; 

4. Designate appropriate parts of the total effort as local subsystems; 

5. Produce manuals covering the project methodology. 

Although not one of these tasks could be undertaken lightly, the intent was for a project 
of limited scope as compared with all the possibilities for exploring bibliographic 
information systems that might have been assigned. The development of computer 
systems was held to a minimum; the literature was sampled rather than covered in full;
the products of the searches and publications programs were prototypes rather than the
initial efforts of operating systems.

Advisory Council 

One important characteristic of URBANDOC was the official presence of a National
Advisory Council. Its members were chosen for both their expertise in relevant areas of 
information science and their ability to represent important segments of the potential 
user community. 

All of the original members, except the chairman, were chosen from outside New York 
City to guarantee the national focus. Their names and their affiliations in both 1965 and 
1969 are listed below to indicate the types of potential users and systems experts whose 
guidance, criticism and evaluation contributed to the development of the project.
(Asterisks designate members who served for the complete active life of the project.) 

*The Hon. Lawrence M. Orton (Chairman)
Commissioner and later Vice-Chairman, New York City Planning Commission

*Mr. Joseph Benson, Chief Librarian
Chicago Municipal Reference Library
(Mr. Benson is now Chief Librarian of the Joint Reference Library, Chicago.)

*Mr. Charles A. Blessing, Director of City Planning
City of Detroit

*Dr. Bernard M. Fry, Director
Clearinghouse for Federal Scientific and Technical Information, Springfield, Va.
(Dr. Fry is now Dean, Graduate Library School, Indiana University.)

*Dr. Robert M. Hayes, Professor of Library Science
University of California at Los Angeles
(Dr. Hayes is also now Director, Institute of Library Research, UCLA.)

*Mr. Edward F. R. Hearle, Director
Data System Services, Griffenhagen-Kroeger, Inc., San Francisco
(Mr. Hearle is now Vice-President of Booz, Allen and Hamilton, Inc., Washington, D.C.)

*Mr. John O. Lange, Executive Director
National Association of Housing and Redevelopment Officials, Washington, D.C.

Mr. Dennis O'Harrow (died in 1968), Executive Director
American Society of Planning Officials, Chicago, succeeded in 1969 by Mr. Israel Stollman

*Mr. Robert B. Pease, Executive Director
Urban Redevelopment Authority of Pittsburgh
(Mr. Pease is now Executive Director, Allegheny Conference on Community Development.)

Mr. Herman G. Pope, Executive Director
Public Administration Service, Chicago, succeeded in 1966 by
Mr. Jacque K. Boyer, Director of Headquarters Services, Public Administration Service

*Mr. Robert L. Williams, Executive Director
American Institute of Planners, Washington, D.C.

In 1966 two additional appointments were made to the Advisory Council:

Dr. Donald L. Foley, Chairman
Department of City and Regional Planning,
University of California at Berkeley

Dr. Theodore L. Hines, Professor of Library Science
Columbia University



The council met formally on ten occasions, eight times in New York and twice in
Washington. These meetings were also attended by the project director and key members
of the staff, as well as representatives of the local sponsors and HUD. The director of the 
Demonstration Grant Program and the director of the HUD Library were always present 
or represented. Aside from the formal meetings of the Advisory Council, individual 
members were frequently consulted by mail or telephone, or in person at meetings of the 
organizations mentioned earlier in this chapter. 

Implementation 

Project Staff 

Helpful as the sponsors and advisors were, the actual implementation of the work 
program was of necessity a staff responsibility. The librarian who had developed the pilot 
project was designated as project director. It was her responsibility to recruit and train 
the other professionals who would be required, subject to the approval of the university. 
Two basic kinds of technical positions developed -- document analysts-librarians and 
systems analysts-programmers. When the staff was at its largest there were three analysts 
in each category, plus two secretaries and a keypunch operator. 

The director, senior systems analyst, and senior document analyst were with 
URBANDOC for its entire active life. Six other librarians and three other programmers 
worked for the project for varying periods of time. The more permanent members of the 
staff felt that turnover would have been even lower had there been assurances of 
continuing federal support. 



Although the total number of professionals involved was too limited to permit sweeping
generalizations about recruitment, project experience indicates that recent graduates of 
library schools are excellent candidates for jobs as document analysts. This is particularly 
true of candidates who had taken one or more courses in information science. However, it 
was essential that their interest in computer applications be at least equalled by strong
interests in the intellectual problems of cataloging and indexing. The reason for 
emphasizing recent schooling is that more experienced librarians often seemed to have 
difficulty in adjusting to a machine-based system. A large operational documentation 
facility — with extensive in-service training opportunities — would be in a better position 
to remedy this situation than was possible in a demonstration effort. 



The decision to recruit document analysts from the library community rather than from 
amongst the planners and other urban professionals stemmed partially from experience 
during the pilot period (Taconic Foundation funding). It appeared that although the 
non-librarians had a great deal to offer in the way of understanding the literature, their 
lack of training in the techniques of indexing and cataloging was too serious a handicap to 
be overcome by a demonstration project. Furthermore, tentative investigations into 
possible candidates indicated that it would be difficult to attract urban professionals with 
sufficient competence, since the job was both documentation-oriented and of a
temporary nature. (The latter constraint was less of a problem to the librarians, who
tended to view a stint with URBANDOC as a way of gaining experience and on-the-job
training in a new and attractive part of their own profession.) 

On the systems side of recruitment, the first requirement was over-all systems and 
programming competence. Here too, flexibility was essential. Bibliographic information 
systems — and a demonstration project environment — are quite different from the 
business world in which systems analysts and programmers usually operate. Fortunately, 
it was possible to attract staff members who were not only competent and flexible, but 
willing to learn a new kind of application of computer technology. Since fewer positions 
and fewer recruits were involved in the systems part of the URBANDOC staff, the project 
had too little experience with this kind of recruitment to warrant many recommendations 
for future URBANDOC-type efforts. 

If the matter of project staff seems to be receiving more than its proportionate amount of 
attention in this report, it is because in-house capability was at a premium in a situation 
where outside consultants were not provided for in the project budget. There was,
however, unpaid input from many external sources. Members of the National Advisory
Council were "built-in" consultants who received no remuneration other than travel
expenses. Representatives of equipment manufacturers (especially IBM) and of users of
systems similar to URBANDOC, such as the Engineering Index and the National Institute
of Mental Health, made themselves available to discuss individual and mutual concerns,
as did users of different systems, e.g., the American Petroleum Institute. The URBANDOC staff
talked with professional colleagues active in documentation and systems development at 
seminars, conferences, and meetings. 



Project Facilities 

Project offices were located at The City University Graduate Center. Because the 
Graduate Center had not yet installed its own data-processing facilities in 1965, computer 
time was made available at other New York City installations. In 1966, arrangements
were completed for URBANDOC to use the newly acquired facilities at the Computer
Center of Baruch College, a senior college of the university. The Baruch College computer 
was used approximately eight to ten hours per week. 

The only data-processing equipment at the URBANDOC offices consisted of a keypunch
machine and, for a time, another encoding device which created data input on magnetic 
tape. The computer system operating tapes and the tapes containing the bibliographic 
records were kept at the Computer Center. However, printouts as "hard copy" of the 
computer processing were always returned to the office for examination. Those copies 
with listings that pertained to the bibliographic records were kept near the physical 
collection of the books, reports, and journals. 

Although URBANDOC did not attempt to operate as a library, the materials that had 
been entered into the system were retained for future staff use, stored in vertical files in 
the same sequence in which their bibliographic representations were stored on computer 
tape. On those occasions when the project received outside requests for access to the 
materials, the document analysts took turns furnishing the necessary personnel services. 



The experience gained from these encounters with "live" users more than compensated
for the small loss in operating efficiency. The encounters also tested the ability of various spinoff
printouts to serve as manual bibliographic tools. 

Work Program 

The design of a work program scheduling anticipated activities necessary to carry out the 
objectives of the project was included in the 1965 grant application. After four years, the 
schedule can still provide useful guidelines for other institutions considering 
computerization for information storage and retrieval. A synopsis of the project activities 
in chart form, arranged chronologically with the years subdivided into quarters as 
required for the project's progress reports to HUD, follows. 

The major alteration in the original work program was the time change from three to four 
years. Substantial enlargement of the data-processing effort was one of the factors in
requesting augmentation of the grant. Other factors were the time needed to originate
cost-analysis information and the development of manuals for publication of the project's
detailed procedures. 

[Chart: Project Staff Composition by Function, original 3-year period]

[Chart: Project Activities by Intensity and Project Quarter]



Project Activity Comment Points
(Chronologically)

1. Review of the Thesaurus by panel

2. CFS Users' Group begun

3. Adoption of Census Bureau Geographic Identification

4. Thesaurus Field Test, first printing

5. Start of DATATEXT experiment for data entry (continued for 2 quarters)

6. Thesaurus Field Test, second printing; Thesaurus transmitted to Clearinghouse for
Federal Scientific and Technical Information for further distribution

7. Input Index #1 distributed

8. Input Index #2 distributed

9. Input Index #3 and Cumulative Index #1 distributed

10. Input to Document File #1 closed with 4000 documents

11. Start of language conversion for program in Local Systems Module (from
AUTOCODER to COBOL)

12. Search Expansion Programs obtained from the Engineering Index; increased
retrieval

13. Input Index #4 distributed

14. Operational Data and Cost Analysis (based on experience of October 1,
1968-February 29, 1969) transmitted to HUD

15. Input to Thesaurus closed, both subject and geographic

16. Input to Document File #2 closed with 2000 documents

17. Prototype Retrieval Reports distributed at ASPO Conference

18. Draft reports start being submitted to HUD 
GOALS: EVOLUTION AND RESOLUTION



The goals of Project URBANDOC were, as pointed out in the "Introductory Summary",
the following: 1. A thesaurus of the indexing terminology; 2. Bibliographic references
produced as the result of computer searches; 3. Hard-copy indexes generated by
computer publications programs; 4. Local subsystems; 5. Manuals.

These goals were all resolved as physical products, each of them manifested in the three 
volumes of the final report series. Two of them are exemplified by materials appended to 
this volume: the sample Retrieval Report that constitutes Appendix A is the result of 
computer searching; the prototype Input Index that is Appendix B represents computer
publications. The subject Thesaurus is published in its entirety as the Appendix to the
General Manual; the General Manual and the Operations Manual together represent the
goal of project manuals, as well as provide the directives for the goal of local systems.

This chapter is concerned with the processes by which the original goals were realized,
and with the evolutions that took place during their realization. It is basically 
input-oriented, concerned with the intellectual problems in creating the bibliographic 
data base. Chapters IV and VI deal with the physical documents represented by the 
bibliographic data, and with the computer processing required to convert the data base 
into meaningful physical products. Chapter V considers the output products in the user 
environment, and discusses the field tests. 

Thesaurus 

The URBANDOC Thesaurus was both the first goal to be enunciated by the project and
the first product to be submitted to public scrutiny. 

Function 

As used in bibliographic information systems, the word "thesaurus" means an 
alphabetical listing of the terms used by a particular information facility to index the 
subject content of its documents. The individual terms in the listing -- called descriptors 
when they are being applied in a computer-aided environment — have appended to them 
cross references and scope notes that provide guidance for their proper usage. Non-urban
bibliographic systems, such as those in the areas of medicine, education, and space, each
have their own thesauri. 



The preparation of at least a draft thesaurus is a necessary prelude to any documentation
development in which the computer system requires that the subject analysis be by a 
controlled vocabulary. At any single point in time, the descriptors in the bibliographic 
records must be the same as those authorized by the thesaurus. 
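
The requirement that record descriptors match the authorized vocabulary can be pictured with a minimal Python sketch. It is not the CFS implementation: the `ThesaurusEntry` layout, the `unauthorized_descriptors` helper, and the sample scope note are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    """One authorized descriptor with its usage guidance (hypothetical layout)."""
    descriptor: str
    scope_note: str = ""
    see_also: list[str] = field(default_factory=list)


def unauthorized_descriptors(record_terms, thesaurus):
    """Return the terms in a record that the thesaurus does not authorize."""
    return [term for term in record_terms if term not in thesaurus]


# Illustrative entries only; the scope note is invented for the example.
thesaurus = {
    "OPEN HOUSING": ThesaurusEntry("OPEN HOUSING", see_also=["HOUSING"]),
    "HOUSING": ThesaurusEntry("HOUSING"),
    "INDUSTRIAL PARKS": ThesaurusEntry(
        "INDUSTRIAL PARKS", scope_note="Planned industrial districts."
    ),
}

# The second term lacks the plural form and is therefore not authorized.
record_terms = ["OPEN HOUSING", "INDUSTRIAL PARK"]
rejected = unauthorized_descriptors(record_terms, thesaurus)
print(rejected)
```

A check of this kind would flag a record for correction before it enters the data base, so that the records and the thesaurus stay in step.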



Although it is technically possible to construct a bibliographic information system that
will perform computer searches without a thesaurus, Project URBANDOC is based on a
programming system that does require the controlled vocabulary. The Combined File
Search System, and the reasons for choosing it, are discussed elsewhere. Although
URBANDOC did examine other ways of handling content analysis, especially in relation
to third-generation equipment (such as the IBM System/360), the original purpose of a
thesaurus as part of the present demonstration was not changed.

The decision for a controlled-vocabulary system was greatly influenced by the easy
availability of the particular set of programs through the IBM Program Library, and by
the apparent ease also of implementing the system. However, this decision was backed up
by a fundamental theory that content analysis should proceed on an organized basis, and
that such organization is well handled by a thesaurus. In conventional library terms, the
thesaurus is comparable to the subject-heading list; in planning terms, it parallels the
land-use classification.

Alternative methods of content analysis, such as "text-processing" systems, depend more 
on the terminology in the particular document. While each method has its virtues and its 
supporters, a comparative analysis of the two by far transcends both the goals and the 
responsibilities of the present demonstration. The text approach did, however, influence 
URBANDOC in the selection of its descriptors, as is indicated in the following section. 

Descriptor Candidates

Word Lists 

There were few published thesauri for computer-aided documentation available when 
URBANDOC started its work, and they were not useful for the analysis of urban 
documents. Published candidates for descriptors in the URBANDOC system thus had to 
be culled from other sources, some of them library-oriented and others user-oriented. 
Examples of library-oriented sources are subject-heading lists, such as the one published
by the Library of the Housing and Home Finance Agency (later HUD) in 1962, and
subject-organized acquisitions lists, such as the "Recent Publications" of the Joint
Reference Library in Chicago, which presently appears biweekly.

The other kind of source is best exemplified by the index to the Urban Renewal Manual 
of HUD and by the Standard Land Use Coding Manual, published jointly by HUD and 
the Bureau of Public Roads. The publications and catalogues of the Census Bureau were 
also a prime source for potential descriptors, since the census compilations provide so 
much of the data base for all urban research. 

The indexes prepared by the professional societies for their own publications, periodical
and monographic, were also considered important evidence of how the users of urban
information conceptualize its organization. The American Institute of Planners, the
American Society of Planning Officials, the Council of State Governments, the
International City Managers' Association, and the National Association of Housing and 
Redevelopment Officials thus all contributed — albeit indirectly — to the URBANDOC 
indexing terminology. 



*The term "content analysis" is used by URBANDOC to cover both the bibliographic and
the subject analysis of its documents, sometimes also referred to as indexing.
However, URBANDOC realized that all of the word lists already in existence, or that
would come into existence during the course of the project, could only provide
guidance for the kind of terminology that would best exploit the capabilities of the
computer for searching on descriptors either singly or in various combinations of AND,
OR, and NOT relationships. (The one other list compiled with a machine usage in mind,
the Urban Thesaurus of Kent State University, was not available until 1968, well after
URBANDOC had completed the bulk of its own thesaurus activity.)
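
Searching on descriptors singly or in AND, OR, and NOT combinations can be illustrated with set operations. The sketch below is only a modern analogy in Python, not the CFS search logic; the document numbers, descriptor assignments, and the `having` helper are hypothetical.

```python
# Hypothetical records: document number -> set of assigned descriptors.
records = {
    101: {"OPEN HOUSING", "CITIZEN PARTICIPATION"},
    102: {"INDUSTRIAL PARKS", "TRANSPORTATION PLANNING"},
    103: {"OPEN HOUSING", "HOME OWNERSHIP PROGRAMS"},
    104: {"CITIZEN PARTICIPATION", "TRANSPORTATION PLANNING"},
}


def having(term):
    """Document numbers whose descriptors include the given term."""
    return {number for number, terms in records.items() if term in terms}


# AND is set intersection, OR is union, NOT is set difference.
both = having("OPEN HOUSING") & having("CITIZEN PARTICIPATION")
either = having("OPEN HOUSING") | having("INDUSTRIAL PARKS")
without = having("CITIZEN PARTICIPATION") - having("OPEN HOUSING")

print(sorted(both), sorted(either), sorted(without))
```

The point of the example is that each added descriptor either narrows or widens the answer set, which is why the choice of terms matters as much as the search program.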

Many of the URBANDOC descriptors were suggested for that express purpose by the 
urban professionals. In this regard the project was fortunate in being able to draw from 
the pilot work that preceded the federal funding. During the Taconic period the ad hoc 
task force — composed largely of members of the New York chapter of the American 
Institute of Planners — met frequently and considered the development of the indexing 
vocabulary one of its prime responsibilities.* 

After the demonstration project had spent six months reviewing the original list and 
adding to it from published sources, a 1600-term compilation was reviewed (April 1966) 
by a second, briefly constituted panel of planners.

Still another source of descriptors was the field test that will be discussed more fully 
later. Of the six hundred urban professionals and librarians who saw the published 
Thesaurus of 1967, many contributed still further ideas on how to gain access to the 
literature by subject. In all these cases of user feedback, the resulting revisions affected 
not only the choice of terms for descriptors, but also the scope notes and cross 
references. 

Word Usages

In addition to the published word lists and user-suggested descriptors, another guide was 
the documents themselves, particularly those that were considered central to the urban 
renewal process. The federal legislation, in particular, was scrutinized for new 
program-related terms. The term "neighborhood service centers," for example, was added 
after the 1968 legislation. Speeches, directives, press releases, and any other indication of 
new concepts were also considered, particularly those emanating from HUD. Other 
sources were monographs and periodical articles. In this respect the project was most 
influenced by those methods of machine indexing which depend solely on the words used 
by the author of the particular document. 

A final source of URBANDOC indexing terminology was the curricula and syllabi in 
relevant university disciplines. Conceptualization is a major concern in the academic 



*The original group included Robert E. Barreclough, Herman Berkman, Peres C. Bhattacharji, S. Robert
Caso, Robert Genestaw, Samuel Jbroff, Lawrence M. Orton, Philip B. Wallick, and Robert L. Wilson.
Other planners who participated in that stage were Robert Alpern and Rodman Davit. Between them
they represented regional and transportation interests as well as the more strictly urban phases of
planning and renewal and also a mixture of governmental, voluntary, consultant, and academic
approaches. The 560-word vocabulary that emerged from their deliberations was used in the pilot
study, and continued as the basis for the formal thesaurus activity under the demonstration.



approach to urban or any other problems, and therefore the organization of the
disciplines for teaching purposes would, it seemed, help URBANDOC in its own
organization of the terminology. Materials from many major teaching programs were 
therefore assembled and examined, and they did indeed prove useful, especially for 
distinguishing usages and providing scope notes. 

Thesaurus Status 

Although the Thesaurus that is appended to the General Manual fulfills the original
project goal of a published indexing vocabulary, neither it nor a subsequent edition could 
be legitimately considered a finished work. The urban field is too dynamic for any 
vocabulary to represent more than the state of the art at a given moment. The field-test 
version that was distributed in 1967 represented the status of the vocabulary in May of 
that year. The version in the General Manual contains the revisions dictated by two years
of added usage, including the feedback and the emergence of new candidates as
indicated in the preceding section. Input to the system ended in June 1969 with six
thousand documents, and so did work on the Thesaurus. Although additional terms have 
undoubtedly become candidates for inclusion since then, work on the final report 
volumes prevented their inclusion. 

Internally, within the amount of information given for the individual descriptors, the 
Thesaurus is also a state-of-the-art report rather than a final guide to the relationships 
between the various descriptors. Scope notes were written as the document analysts 
became aware of their necessity to distinguish usages in particular cases. The Thesaurus is 
therefore neither a dictionary nor a glossary, although it does make some contributions 
toward the development of these tools.

It should be pointed out that at any moment in history the published version of the 
Thesaurus could be out of phase with the tape version that is part of the Combined File
Search System. This is because the Thesaurus tape can be updated without making a new
printout. For some revisions, it is necessary only that the appropriate notes be added
manually to working copies until there are sufficient changes to warrant the expense of 
printing a new edition. 

For those who have to know just how the Thesaurus terms have actually been used in
indexing documents, the system contains another tape and another printout that provide
the answers. The inverted file lists all the descriptors that have so far been entered into
the data base, each together with "document numbers." The numbers identify the
documents in whose content analysis the descriptor plays a part. There is also a listing that tells
the last date on which each descriptor was either added to or deleted from the data base.
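
An inverted file of this kind can be sketched briefly in Python. The sketch assumes an in-memory dictionary rather than the tape layout actually used, and the `build_inverted_file` helper, document numbers, and descriptor assignments are hypothetical.

```python
from collections import defaultdict


def build_inverted_file(records):
    """Map each descriptor to the sorted document numbers indexed under it."""
    inverted = defaultdict(list)
    # Visiting documents in ascending order keeps every posting list sorted.
    for number in sorted(records):
        for descriptor in records[number]:
            inverted[descriptor].append(number)
    return dict(inverted)


# Hypothetical document numbers and descriptor assignments.
records = {
    2: ["HOUSING", "CITIZEN PARTICIPATION"],
    1: ["HOUSING", "INDUSTRIAL PARKS"],
    3: ["INDUSTRIAL PARKS"],
}

inverted_file = build_inverted_file(records)
for descriptor in sorted(inverted_file):
    print(descriptor, inverted_file[descriptor])
```

Reading one entry of such a file answers the question "which documents carry this descriptor?" without scanning every bibliographic record.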

Descriptors as Subject Headings 

In order for URBANDOC to publish an index, it was necessary to make some decisions 
about how to provide a subject approach to the literature, whether in the Input Index or 
a similar vehicle for manual use. The obvious way to display bibliographic references by 
subject is, of course, an alphabetically arranged list of subject terms. The machine 
capability was provided by the computer programs that URBANDOC devised as part of 
its "publications module." That capability did not, however, require any particular
intellectual method of assigning the subject terms. 

URBANDOC's decision to base its Input Index subject-heading list on the
retrieval-oriented Thesaurus was made after examining the consequences of the alternative of
constructing a separate subject-heading list. The most obvious disadvantage of this course 
of action would be the extra effort involved in learning and maintaining two separate 
indexing terminologies. Furthermore, a dual system would have militated against the 
efforts of conventional library operations to start on the road to computerization by 
experimenting with descriptor-like terms for their interim cataloguing. 

However, in order to make the descriptor list provide meaningful terms for the printed 
subject index, it was necessary to include a sufficient selection of descriptors that could 
stand alone. This accounts for the occurrence in the present thesaurus of so many
multi-word descriptors (referred to by the documentalists as "prebound terms"). OPEN
HOUSING, CITIZEN PARTICIPATION, and HOME OWNERSHIP PROGRAMS are all
examples of concepts that could be handled, for retrieval purposes alone, by linking
OPEN with HOUSING, CITIZEN with PARTICIPATION, and so on. As prebound terms, they not
only make subject headings out of descriptors, but also make it easier to write search
statements for the retrieval part of the total system.

The decision to increase the number of "stand-alone" descriptors did present one 
problem: the inevitable lengthening of the Thesaurus. Eventually, it would also cause the 
Thesaurus to become loaded with terms that had enjoyed a brief period of popularity and 
then faded from public consciousness. It would therefore be necessary, in an ongoing 
operation, to strike a proper balance between single-word and multiword descriptors.

Thesaurus Conventions 

The technical considerations involved in devising a thesaurus could constitute a volume in 
themselves. One of them is the use of "natural language", meaning that the subject 
descriptors are used as they appear in the printed version; they are not represented by 
numeric codes. (This is not true for the geographic terms, as is discussed at length in the 
General Manual.) The content analysis was therefore not a matter of coding but of
entering appropriate subject terms. 

Another consideration is that the terminology was related to the subject concern of the 
project, urban renewal. Although this was interpreted sufficiently freely that many
urban-related disciplines are represented, the present Thesaurus makes no claim to 
provide the terminology that would be necessary for a project with other subject 
orientations. URBANDOC was loath to include terms for which it would have no usage
experience, and therefore did not take advantage of its freedom in the compilation of the 
terminology. 

In matters of style, URBANDOC generally followed practices common in documentation. 
Plural forms were preferred to the singular, and nouns to adjectives. Cross references 
between more specific and more generic forms also followed common practice, although 
URBANDOC went further in indicating possibilities for coordinating terms as well as 
"prohibiting" coordinations that might lead to redundancies. 
In order to provide greater assistance to the user of the Thesaurus, URBANDOC also
made provision for the publication of a "permuted" form, in which there is one listing for
every single word in a multiword term. Therefore PLANNING, PLANNING EDUCATION,
and TRANSPORTATION PLANNING all appear together with the planning terms
as well as in their normal sequence for term-by-term alphabetizing.
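
The permuted form can be approximated in a few lines of Python. This is a sketch of the general idea only, assuming that every word of a term earns one entry; the `permuted_listing` helper is hypothetical and does not reproduce the project's publication programs.

```python
def permuted_listing(descriptors):
    """Return sorted (word, descriptor) pairs, one per word of every descriptor."""
    pairs = {(word, term) for term in descriptors for word in term.split()}
    return sorted(pairs)


terms = ["PLANNING", "PLANNING EDUCATION", "TRANSPORTATION PLANNING"]
listing = permuted_listing(terms)

# All terms containing the word PLANNING now file together under that word.
planning_group = [term for word, term in listing if word == "PLANNING"]
print(planning_group)
```

TRANSPORTATION PLANNING thus appears twice in the listing, once under each of its words, which is what lets a user find it from either direction.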

Retrieval 

The most challenging part of the URBANDOC mission was the development of the 
retrieval portion of the over-all system. A typical retrieval might involve finding reports
that describe the activities of regional councils regarding environmental protection in a 
specified Standard Metropolitan Statistical Area. 

Retrieval Results 

The end result for the URBANDOC user would be a computer-produced printout that 
lists such bibliographic elements as author, title, publisher, and any additional 
information that was stored in the first place, and asked for in the second. (The content 
and construction of the data base will be discussed at length later in this volume and in 
the manuals.) Important here is the fact that the typical format of a processing report in 
retrieval systems is the hard-copy printout made at the central facility. Its transmission to 
a user at a remote location is still by traditional means of communication. 

This is far less dramatic than the dream of the information industry to have a terminal on 
the desk of every inquirer. However, the chief miracle at the present stage of retrieval
development continues to be in the search process; it is not in more sophisticated ways
of communicating with the computer. URBANDOC was far too involved in trying to 
guarantee the quality of retrieval results to become concerned with other than traditional 
batch processing. 

Systems Choice 

The heart of the URBANDOC search capability was IBM's Combined File Search System
(CFS), obtained for URBANDOC from the IBM Program Library. The intent had been, 
from the beginning, to adopt or adapt an already existing system; the URBANDOC 
budget and work program made little allowance for major efforts in this area. CFS was
considered for adoption even before the formal start of the demonstration in July 1965. 
The fact that the Engineering Index, the National Institute of Mental Health, the Food 
and Drug Administration, and others were using CFS was a heavy factor in the 
preliminary decision. 

During the early months of the project, considerable time was devoted to learning the
system and conducting preliminary tests. A small controlled vocabulary was compiled, some
bibliographic records were entered into the system, and searches were made. The initial
results indicated that the system could perform sufficiently well to form the basis of the 
rest of the demonstration. Since there was no other system easily available with the same 
capabilities, and as well suited to the computing facilities, URBANDOC proceeded to use 
the CFS, and to make such modifications as became necessary. 
The evaluation of retrieval systems is a more difficult subject than might seem to be the
case. One issue is the programming system; another is the worth of the bibliographic 
citations as the end result of the search process. 

Functionally, the CFS that is the heart of URBANDOC's work in this area performed 
well. The individual programs accomplish what they set out to accomplish, especially 
after some initial "debugging." However, the system is not easy to learn, an initial defect.
In addition, extraordinary care is required in all the procedures, both input and output 
(the consequences of errors, and the necessary remedial actions, are discussed in the 
Operations Manual). More automatic editing features might have been contained in the
original system, but URBANDOC compensated for this by adding a pre-edit module of its 
own design. 

Efficiency is another matter, one which is difficult to judge in a demonstration 
environment. URBANDOC was in an especially difficult situation in this regard because it 
was evaluating a second-generation system (IBM 1401) at a time when everybody was 
already aware of the greater possibilities claimed for the newer equipment and 
programming systems. In any case, the staff felt that operating efficiency would be a 
problem on the existing version of CFS with a large volume of input and requests, but 
had no opportunity to prove this statistically. 

Search Evaluation — Subject Considerations 

The success of the search system from a content point of view depends on there being a 
match between the content analysis of the documents and the interests of the user. Both 
are expressed in the terms of the descriptors in the Thesaurus, and are explained at length 
in the General Manual. However, even the clearest of directives cannot guarantee that the
system will produce a relevant document when needed, since there are so many possible 
ways of expressing concepts, especially in urban affairs. 

A great deal of discussion has taken place among the documentalists in attempting to
evaluate the results of computer searching. Projects associated with the College of
Aeronautics (Cranfield, England) have provided the information community with two
important criteria: "recall" (the ratio between the number of bibliographic citations that
were actually retrieved and the number that should have been retrieved) and "relevance" (the
ratio between the retrieved citations judged to be relevant and the total number of
citations in the retrieval report).
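
The two ratios can be worked through on a small example. The Python sketch below follows the usual reading of the definitions, in which recall counts only those retrieved citations that should have been retrieved; the function names and the document numbers are assumptions chosen for illustration, not project data.

```python
def recall(retrieved, should_have_been_retrieved):
    """Share of the citations that should have been retrieved that actually were."""
    if not should_have_been_retrieved:
        return 0.0
    found = retrieved & should_have_been_retrieved
    return len(found) / len(should_have_been_retrieved)


def relevance(retrieved, judged_relevant):
    """Share of the citations in the retrieval report that were judged relevant."""
    if not retrieved:
        return 0.0
    return len(retrieved & judged_relevant) / len(retrieved)


# Hypothetical search: four citations retrieved, six relevant ones in the file.
retrieved = {1, 2, 3, 4}
relevant = {2, 3, 4, 5, 6, 7}

print(recall(retrieved, relevant), relevance(retrieved, relevant))
```

In this made-up case three of the four retrieved citations are relevant, but three relevant citations were missed, so the report looks good to the user while half of the useful material stays in the file.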

Relevance is, of course, the easier criterion to apply since all the evidence is at hand in the
printout that the computer produces when it has finished searching. The URBANDOC 
staff had two kinds of results to examine: those that answered questions formulated 
internally for testing purposes, and those submitted by outsiders. Both types of searches 
appeared to produce highly relevant lists of bibliographic citations. The staff's own 
impressions in this regard were confirmed by those outside users who provided feedback. 

How well the searches functioned in terms of the possibilities of the entire file is more 
difficult to answer, especially for the outside user who has no knowledge of the preceding 
input. Once a data base exceeds several hundred items, it is difficult to remember which
documents should have been retrieved for a particular formulation of user needs. 
However, the staff felt reasonably secure that its searches had functioned in this regard 
also. 

Although the entire question of search strategy is discussed fully in the General Manual, a
few words about the subject descriptors are also in order here. Searches that can be 
expressed by a single, very precise descriptor have a better chance of success than those 
that require an imaginative coordination of various possible descriptors. In general, the 
narrower the concern of the user, the better will be the retrieval: it is obviously easier to 
find all the references on INDUSTRIAL PARKS than on the entire URBAN 
ENVIRONMENT. The same is true, of course, for manual information systems, such as 
library card catalogues. The only difference is that manual systems have generally shown 
less concern for measuring their response to queries. The real issue in evaluating
computerized searches would be whether they function better than their manual 
counterparts. 
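
The difference between a single-descriptor search and a coordinated one can be sketched as follows. This is an illustration only: the inverted index, the document numbers, and the ZONING descriptor are assumptions, and the code does not represent how the Combined File Search programs work internally.

```python
# Hypothetical inverted index: descriptor -> set of document numbers.
index = {
    "INDUSTRIAL PARKS": {101, 102, 107},
    "ZONING": {102, 103, 107, 110},
    "URBAN ENVIRONMENT": {101, 102, 103, 104, 105, 107, 110, 112},
}


def search_all(index, descriptors):
    """Return the documents indexed under every descriptor (logical AND)."""
    sets = [index.get(d, set()) for d in descriptors]
    if not sets:
        return set()
    return set.intersection(*sets)


# One precise descriptor: the result is exactly its posting list.
single = search_all(index, ["INDUSTRIAL PARKS"])

# Coordinating two descriptors narrows the result to their overlap.
coordinated = search_all(index, ["INDUSTRIAL PARKS", "ZONING"])
```

A coordinated search succeeds only if the searcher guesses the same combination of descriptors that the analyst assigned, which is one reason the single precise descriptor tends to fare better.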

In the preceding discussion on search evaluation, the implication has been that the entire
master file is being used for "retrospective" searching. There is another use of the search 
capability, called "selective dissemination of information." In this case, only the new 
input is being searched, on some kind of regular input cycle. Ordinarily the questions are 
not new, but a standard "profile" that has been developed for each user or group of users. 
Since the parameters of the search are known before the document analysis, it is possible 
to orient the analysis more accurately toward the users in this situation than for the 
retrospective kind of searching. The performance of URBANDOC in a selective 
dissemination environment is a matter that the staff would have liked to explore next, 
had there been sufficient time. 
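
Since the project did not implement selective dissemination, the following is only a sketch of the idea under stated assumptions: standing profiles are matched against the new input for one cycle rather than against the master file. The profile names, descriptors, and document numbers are all hypothetical.

```python
# Hypothetical standing profiles: user -> descriptors of interest.
profiles = {
    "planner_a": {"MODEL CITIES", "HOUSING"},
    "agency_b": {"INDUSTRIAL PARKS"},
}

# Hypothetical new input for one cycle: document number -> descriptors.
new_input = {
    201: {"HOUSING", "RELOCATION"},
    202: {"INDUSTRIAL PARKS", "ZONING"},
    203: {"TRANSPORTATION"},
}


def disseminate(profiles, new_input):
    """Match each standing profile against the new input only."""
    notices = {}
    for user, interests in profiles.items():
        # A document is reported if it shares any descriptor with the profile.
        notices[user] = sorted(
            doc for doc, descriptors in new_input.items()
            if interests & descriptors
        )
    return notices


notices = disseminate(profiles, new_input)
```

Because the profiles exist before the documents are analyzed, an analyst could in principle assign descriptors with these known interests in mind.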

Search Evaluation — Geographic Considerations 

Geographic analysis always played a large role in URBANDOC thinking because planning 
and renewal is, after all, for a specific place on the earth's surface. Practitioners in this 
field have a greater need than most to be able to retrieve on the basis of a specific place. 

The project was aware of the problem in relying solely on place names for retrieval. Too 
frequently an area transcends political boundaries, particularly for metropolitan and 
regional planning. The original hope had been to define an area by a grid system, such as 
the coordination of latitude and longitude. Although much effort was spent in trying to 
find appropriate methods of analysis and retrieval, no manageable choice emerged. The 
geographic identification systems developing in other parts of the urban information 
community were not applicable or transferable to a document-based system. URBANDOC
could not develop its own scheme of analysis due to work program and budget
considerations. 

By the second year the project could no longer postpone some kind of decision and 
decided that the best solution was to adapt some of the geographic identification work of 
the Census Bureau for United States materials. The numeric codes published by the 
bureau in its Geographic Identification Code Scheme (1961) were the ones used. The
project manuals describe in detail the present capabilities for retrieving documents
according to state, county, and local name, range of city size, and Standard Metropolitan 
Statistical Area. It is a significant advantage that the census system is a national standard, 
recognized by all renewal and planning agencies; of equal importance is the government's 
responsibility for updating the Census Bureau Code. 

Although the ability to retrieve document references on the basis of city size was not 
anticipated in the original project goals, the cost of adding this feature was negligible once
the Census Bureau code scheme was incorporated into the geographic section of the 
Thesaurus. It provided an answer to the problem of avoiding a large-city bias, since it was 
now possible to restrict a search to cities within certain population ranges. The only 
problem with the present system is its basis on the 1960 census, admittedly out of date 
for many places. It will require correction after the 1970 enumerations become available.
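
Restricting a search to a population range can be sketched as below. The sketch assumes each citation carries a raw population figure for its place; the actual system worked from Census Bureau codes as described in the project manuals, and the place names and figures here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    doc_id: int
    place: str        # hypothetical place name
    population: int   # hypothetical census population of the place


citations = [
    Citation(301, "Place A", 12_000),
    Citation(302, "Place B", 85_000),
    Citation(303, "Place C", 640_000),
    Citation(304, "Place D", 49_999),
]


def in_size_range(citations, low, high):
    """Return doc_ids whose place population satisfies low <= population < high."""
    return [c.doc_id for c in citations if low <= c.population < high]


# Restrict the search to smaller cities, avoiding a large-city bias.
small_city_docs = in_size_range(citations, 10_000, 50_000)
```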

Machine Versus Manual Searching 

Although the URBANDOC responsibility was to develop machine methods of dealing 
with bibliographic data bases, manual methods are by no means completely out of the 
picture. The section on publications that follows deals essentially with a computer- 
produced tool designed for personal consultation by the user of the system. Certainly a 
hard-copy index is — at this stage of the URBANDOC art — a better way of browsing 
through a small file than is the machine retrieval capability. 

Publications 

The goal of using the data base and programming system to produce an index journal has 
always been central to URBANDOC. During the pilot stages, author, title, and subject 
listings were produced, and sample pages included in the report that was prepared for the 
American Institute of Planners in 1964. There was a long road to be traveled, however, 
from there to the present Input Index.

Most important, perhaps, was the evolution in the project's thinking about the amount of 
attention involved in producing a satisfactory index. It had been thought that the index — 
although an important goal in its own right — could be handled as a by-product of a 
bibliographic system geared primarily to the goal of machine searching. It turned out that
considerable additional effort was involved in almost every aspect of the project in order 
to achieve the publications goal in the first place, and then to sustain it. 

Publications System 

URBANDOC chose its basic programming system, IBM's Combined File Search, with full
knowledge that it did not contain a publications module — the group of computer 
programs that would be necessary to implement the publications goal. However, a careful 
preliminary examination of the system confirmed the original premise that it would not 
be difficult to interface the basically search-oriented CFS with the kinds of publications 
programs that seemed appropriate. The problem was how much of the publications 
module could be picked up from existing systems, and how much would have to be 
written by the project.

It turned out that an independent course of action had to be pursued even though there
were in existence, by 1965, several generalized publications modules. There would have 
been no problem in obtaining the permuted title index programs from IBM known as 
KWIC (Key Word in Context) and KWOC (Key Word Out of Context). The KWIC-type 
programs were judged unsatisfactory since their strength was in manipulating titles, and 
titles in urban affairs rarely improve with permutation. In addition, the records were not 
long enough to accommodate all the bibliographic data that were associated with the 
URBANDOC documents, especially the lengthy corporate names. 

Additional problems arose when publications modules used by other documentation 
centers were investigated. Some of them were of a proprietary nature, and thus not 
readily available. Others — from federal sources — were not adaptable to the data 
processing that was to be available to URBANDOC. Without access to computer-assisted
means for composing type, for example, URBANDOC would have to depend entirely on 
the format capabilities of the IBM 1401 computer in conjunction with the 1403 printer. 

Despite these constraints, the staff designed a publications module that it considers one 
of the project's major systems achievements. The system interfaces completely with CFS, 
thus permitting URBANDOC to use the same input for both publications and search. The 
programs can be used either with the current input, to produce current indexes, or with 
the larger master file, to produce indexes to the entire data base. There is also great 
flexibility in the way individual data elements are originally identified and later selected 
for listing, thus allowing for many versions of the Input Index without the need to 
reprogram.

The publications module can also be used independently in a documentation facility that 
is not interested in computer searching, only in producing a printed index. These various 
possibilities, as well as the details of the module itself, are discussed in both the General 
Manual and the Operations Manual. Further discussion on the evolution of the Input 
Index as an identifiable product of the publications module is contained in Chapter V, 
"Product Development". 



Descriptive Analysis 

The effect of the Input Index on the indexing vocabulary has already been mentioned. 
However, the URBANDOC experience indicates that subject is a less secure access point 
to the literature than author, title, and other definitive items of bibliographic 
information. (The matter of data elements is discussed at length in the General Manual.)
It was through the process of exploiting these other approaches to the document base 
and turning them into a useful product, that URBANDOC fully realized the significance 
of descriptive cataloguing, which URBANDOC prefers to consider under the more 
inclusive term "descriptive analysis." 

The project had started with a greater interest in the content analysis than in the 
descriptive part of the task, and assumed that existing library methods of recording
author, title, and the other basic bibliographic data would suffice.

Examination indicated that the COSATI authority rules were not prepared to handle
many of the problems encountered by URBANDOC, particularly those of corporate 
author. The reason was that COSATI was mostly interested in the scientific and technical
documents, and had devoted its attention to the handling of company names, particularly 
those of the nationally known corporations with many local installations that had to be 
differentiated, such as General Electric. It had little or no experience with the many state, 
regional, and urban agencies contributing to the URBANDOC data base. URBANDOC's
own staff therefore had to take on the rather onerous task of establishing workable rules 
for handling these subnational corporate entries in ways that were consistent with the 
spirit, if not the letter, of the new national standards. 

The General Manual sets forth the present principles of the URBANDOC bibliographic
record and compares it with the more familiar library catalogue card. Included is an 
explanation of data elements and the methodology for identifying them — important 
considerations when using the bibliographic record to support a publications program. 
(See chapter II, "Document Identification," for an explanation of the order in which 
references appear in the Input Index.) All the sources reviewed by URBANDOC in the
compilation of its descriptive methodology are listed in the bibliography accompanying 
the manual. 

Local Subsystems 

The fourth goal of the URBANDOC demonstration, related to local systems, emanates
from the desire of many institutions (agencies and universities in particular) for directives
that would enable them to use the URBANDOC programs internally. It was first
necessary to discover what kind of use was contemplated. Use of URBANDOC programs
could be based on either the retrieval or the publications capability. Emphasis was placed
at first on retrieval: local ability to search centrally produced URBANDOC tapes or the
ability to construct and search their own searchable files. Unfortunately, neither kind of
interest in local retrieval capabilities appears feasible in many situations. The high level of
systems support required for retrieval operations is discussed in the General Manual.

As an alternative course of action, URBANDOC recommends the use of the publications
programs as the core of a local system. Despite their lower status in terms of glamour, the
publications programs have a great deal to offer. The records created by these programs
can be used to produce manual tools, which in turn can provide limited retrieval
capability when searched. The possibilities illustrated by the Input Index are extensive.
URBANDOC recommends that they furnish the basis for local exploration into
bibliographic control. 



In the complete URBANDOC system, input data are processed first by the edit programs
and then by the file-maintenance programs (to build the searchable files) and by the
publications programs (to build the publications files). The system is sufficiently flexible,
however, to allow data from the edit programs to be processed directly by the
publications programs. (These procedures are explained in the Operations Manual.) With
publications, as with retrieval, it is theoretically possible to construct one's own data base 
and/or use bibliographic tapes created elsewhere as input to a local system. The first 
course of action would seem to present the least difficulty, regardless of the nature of the
data. However, with either course of action, the importance of edited data (and, in turn,
of the edit programs) becomes self-evident.

Manuals 

One of the most important parts of the project's mission was to produce manuals 
describing in detail the various components of the system, both the programs and the
data they were to process. The written documentation would serve several purposes: to 
provide guidance for those people who might be called upon in the future to implement 
all or part of the URBANDOC system; to provide a record of how the project fulfilled its
various tasks; and to record URBANDOC's experience for the profession of
library-information science. 

The material is presented in two sections: the General Manual and the Operations Manual.
The General Manual contains all the procedures applicable to document analysis, as well 
as the general approach to the programming system. The Operations Manual contains the 
details of the programming system, as well as such program-connected details on the 
input as data entry and error detection and correction.

THE DOCUMENT BASE

No bibliographic information system can be better than the document base that it 
represents. The physical books, articles, and reports must be acquired and identified 
before they can be subjected to the analytical procedures that create the bibliographic 
records. The first three-quarters of this chapter present the considerations that go into the 
construction of the document base. 

After the individual items are no longer needed by the document analysts, they can be
either stored, sent to another document collection or library, or discarded. Which course 
of action is adopted depends on the plans for providing physical access to the documents 
subsequent to the kinds of bibliographic access that are provided by the information 
system. Although physical access was not originally an URBANDOC responsibility, some
discussion of it appears indicated by the questions that have come to the project, and this 
is handled by the last section of this chapter. 

Scope and Limitations 

The HUD grant to The City University of New York established URBANDOC's concern 
as a "system for storing and retrieving bibliographic references to published materials
used by urban renewal and planning technicians." 

There was no itemization of particular subject areas to be covered, and this left the 
project free to determine the specifics of its document base. It became evident that any 
statement of coverage by subject could be only the first step in defining scope, and that 
other statements would have to be developed to deal with documents according to their 
sources, their currency, and their relationship to copyright protection. 

Subjects 

In May 1966, toward the end of the first year, URBANDOC submitted to HUD a
statement that established the following principles for determining scope in regard to
subject: 

The density (breadth plus depth) of the URBANDOC coverage must be greatest for
that literature which is unique to urban renewal: written by and/or for the 
professional practitioners. In the first ring away from this core are those materials 
where a specific relevance is readily apparent, either in the text or because of close
and well-known association. By definition, the latter includes comprehensive 
planning at the various levels of government, as the context within which renewal is 
effectuated. Radiating out toward an eventual finite boundary of inadmissibility 
lie those books and documents whose prime utility is for other activities, but 
whose relevance to the slum blight problem is sufficient to warrant inclusion as 
resources permit. 

The document base built by the project was reasonably consistent with this early 
definition of parameters, although later events made some expansions inevitable. The 
emergence of the Model Cities Program, the increased emphasis on the private sector, and 
the evolution of planning itself were all reflected by the addition of sample documents. 
Some coverage of urban information systems was also inevitable in terms of the project's 
own view of bibliographic information systems as part of a larger picture. However, the
essentially limited nature of the project did not permit the kinds of expansion that might
have made the final prototype products more truly reflective of the current literature on 
urban affairs. 

The best single measure of the project's awareness of its subject responsibility is the final 
version of the Thesaurus as it appears in the Appendix to the General Manual. Since it
cannot enumerate the actual uses of the various terms, it is only a partial measure. Of
possible descriptors it does show the parameters, the general considerations and the areas 
in which specificity was desired. The listing does not, obviously, include any terms that 
might have become necessary since input to the Thesaurus ended in June 1969; it does 
not include terms that might have been necessary in an urban bibliographic information
system that was aided by a federal program other than the Urban Renewal Demonstration
Grant program. 



Sources 

URBANDOC policy for defining scope statements now contains, as the result of four
years of experience, a source statement. The reason is the proliferation of the literature to 
the point where no system can guarantee coverage of an entire field. It is therefore of
primary importance for the user to know which documents have been covered by a 
particular information system, and which remain to be covered. 



Potential sources include publishers, issuing agencies, and funding agencies. The publishers are
not only commercial presses, but also professional societies, foundations, voluntary
organizations of various kinds, and academic institutions. They may also be consultants 
and industry groups. Issuing agencies, particularly for urban planning and renewal, are 
federal, state, and local bodies. Both publishers and issuing agencies have to be solicited
individually by the bibliographic system. In the case of URBANDOC, particular sources 
were suggested by the American Institute of Planners, and then contacted by letters 
describing the project and requesting cooperation. A great many documents, including 
periodicals, were received by this means during the time that the staff was putting records 
into the system.



The third type of source, funding body, is exemplified by the Urban Planning Assistance 
Program of the Department of Housing and Urban Development. Although usually it is
not the body that issues the report, it can issue directives to grantees regarding 
distribution of the published results of its aided projects. A listing of the cooperating 
funding sources will also help define the coverage of a particular bibliographic 
information system. 



During the URBANDOC demonstration, there was one funding body source, the HUD 
program mentioned above. The agencies receiving Urban Planning Assistance Program 
grants were asked to send one copy of every report to URBANDOC, as part of a 
Depository Library System. Not all the reports were entered into the URBANDOC
system, as the project was not operational in the sense of offering a public service with 
stipulated parameters. 



Copyright Limitations 



A number of would-be users of the URBANDOC system have raised the issue of
copyright protection. The important point is that the copyright extends only to the 
content of publications; it does not affect the right of an information system to create 
bibliographic records referring to publications that are legally protected. However, the 
records for such works must not include abstracts that are part of the contents so 
protected. 

It is perfectly permissible to enter into the system a bibliographic citation referring to an 
article in the Journal of the American Institute of Planners without permission from the 
institute so long as the published abstract is not also included. If an abstract is desired, 
the bibliographic facility can either seek permission to include the author-prepared 
abstract, or else prepare its own. 

There are some cases in which not only abstracts but any part of the contents that
seemed desirable could be included in the bibliographic record. This possibility exists for
those documents which are not subject to copyright protection, a frequent situation with 
many government or government-funded documents. 

It should be pointed out that the whole area of information systems is involved in a series 
of legal issues, some of which are not yet resolved by the courts. 

Currency Limitations 

The limited nature of the demonstration did not permit URBANDOC to build a 
document base that could adequately represent both current and retrospective materials.
The URBANDOC decision was for currency. 

During the period that the project was adding bibliographic records to its data base, 
publications would not ordinarily be eligible for representation in the Input Index unless 
they had been received within the calendar quarter.

A year would be the usual limitation for inclusion of materials in the searchable file. 
(Exceptions had to be made for non-United States materials which travel by boat, and for 
materials whose distribution was apparently delayed due to some governmental or private 
reason.) The implementation of this policy obviously meant that not all input would be
processed in the same way; in fact, those records not routed through the Input Index 
were processed in separate cycles which avoided the publications module completely. 

The Index was therefore truly a "current awareness" type of publication, one whose 
reputation would be built on its currency. This would also be true of the Retrieval Report 
used for selective dissemination of information, which URBANDOC envisioned as using 
the same data base as the Input Index. With both products, currency was not just an 
internal policy but a matter to be brought to the attention of users. The experimental
Retrieval Reports produced for the 1969 American Society of Planning Officials 
conference provided a place for the description of the data base in terms of the receipt 
dates of the documents in the base.

Although the particular time frames developed by URBANDOC in its demonstration
phase are obviously subject to adjustment, this additional kind of approach to the scope 
problem seems to be both workable and reasonable. Experience with potential users 
indicates that they react favorably to an explanation of the necessity of limiting the input 
to current materials. In addition, they seem to like the assurance that they can reasonably 
depend on one bibliographic service to furnish them with the first references to new 
materials within the rest of the scope coverage. 

Geographic Limitations 

Geography as applied to subject was limited only in terms of the rest of the subject 
statement. If a particular urban problem or technique qualified for inclusion, it qualified 
regardless of whether the locale was in or out of the United States. The orientation of the 
document analysis, however, was in terms of applicability to the United States, because 
the document analysts used terminology that was oriented toward the goals of the Urban 
Renewal Demonstration Grant program. A more internationally oriented documentation 
effort would, of course, include a great many more non-American documents, and
provide additional analytical focuses through enlarging the Thesaurus appropriately. 

Language Limitations 

The scope must be defined by one additional dimension related to geography. As applied 
to the sources, the original understanding was that any sources qualified as long as the 
language of publication was English, whether originally or simultaneously. Although
most of the English-language sources were in fact American, the document base includes 
government documents and periodicals from the United Kingdom, Canada, and Australia. 
The multilingual sources are represented by the United Nations and the International 
Federation of Housing and Planning. 



Acquiring the Documents

URBANDOC did not purchase the materials for its document base, relying instead upon
the cooperation of the agencies, organizations, and publishers in the field. 

Automatic Transmittal 

Emphasis of the acquisitions effort in a document facility is more efficiently placed on the
sources rather than on the individual publications. The determination of source candidates, and
the review of their analytic and retrieval value at regular intervals, should be an important 
procedure in an attempt to implement a documentation effort. During the start-up 
period, the members of the Advisory Council who were executive directors of 
professional organizations were asked to suggest appropriate sources, with the emphasis on 
quality rather than mere existence. This procedure was not publicized lest there appear to 
be an endorsement of particular agencies or their work. In an ongoing operation, 
appropriate determinations of sources will be crucial. 

URBANDOC's experience indicates that it is reasonable to expect automatic transmission
of documents within certain frameworks. It is most realistic to expect compliance with
HUD directives, especially when they have been in effect long enough for all the routines 
to be sufficiently established. The transmission of other materials depends in part on the 
degree to which an agency maintains mailing lists.

Other factors affecting automatic transmittal were connected with the expectation that 
listings in the Input Index would produce large volumes of requests to the issuing bodies 
for the documents themselves. Organizations interested in selling documents would
benefit from the publicity, but to others it would be a burden. The answer in the latter 
case seems to be to guide users to other places for copies of the documents, such as the 
National Technical Information Service (formerly the Clearinghouse for Federal Scientific
and Technical Information), the professional societies, or commercial services. 

Solicitation 

To the municipal reference libraries, which specialize in state and local documents, 
acquisition is a particular chore. There is no registry of urban-related nonbook materials 
that is comparable to the announcements of new trade and university press books in
Publishers' Weekly. The HUD Library's Housing and Planning References is one good
vehicle for learning about new documents, and the acquisitions lists of other libraries are 
also helpful. 

However, there must be constant vigilance for the appearance of new sources, a common 
occurrence in the urban field. This was not a serious problem for URBANDOC, since the
project was neither operational nor committed to a specific kind of coverage for the 
duration of the demonstration. The staff did watch for mention of interesting new 
sources, and attempted to widen its circle of contributors, without asking for more 
materials than could be handled by the existing staff. 

The project also found it necessary to renew requests to places that had already indicated 
their willingness to cooperate. Sometimes their automatic mailings did not include the
particular kinds of documents that the project wanted to include in the system. In other
instances, the responsibility for certain kinds of reports might shift to a new agency,
as in the case of the Model Cities Program, without the old mailing lists
accompanying the shift. Whether it is a generalized solicitation to a new source or a
request for other materials, the solicitation work of an urban information facility is
never-ending.

It might seem easier for a project such as URBANDOC to publicize its quest for new 
materials, particularly in the professional journals. However, the staff wanted to avoid the 
embarrassment of receiving materials that could not be entered into the system, whether
for reasons of irrelevance, poor quality, or just quantity. Even with the quiet solicitation 
of sources, a great many interesting and high-quality documents had to be excluded. 

Selection, or Inclusion 

While the choice of documents starts with the selection and solicitation of the sources, 
the process of assembling the document base continues until the staff has had the 
opportunity to examine the publications themselves. The amount of selectivity that can
be exercised at this point is somewhat circumscribed by the scope statement and the
terms of the acquisitions arrangements. It seems obvious that the staff should be able to 
reject materials that are clearly unsuited to the system, such as newsletters, meeting 
announcements, and other brochures of limited immediate interest. There was no 
question about excluding such materials on the basis of URBANDOC economics. 

A more serious problem concerns quality. There have been many queries as to how a 
bibliographic information system would exercise qualitative judgement in the construction 
of its document base. This is an activity which the present staff has been reluctant to 
explore, especially in view of divergences of opinion as to the merits of many works. 
URBANDOC feels it is better policy to include all documents received that fall within the 
scope statement, but to make any necessary distinctions by selecting different modes of 
treatment. The following three courses of action seem to cover the chief contingencies. 



Automatic Inclusion 

In the case of those sources where a project has agreed to serve as a depository library for 
certain classes of documents, all major receipts with depository status should be added to 
the document base, and indexed as fully as is consistent with general policy. To decide 
otherwise might endanger the ability of the system to provide the proper access to the 
documents. In the case of the Urban Planning Assistance Program reports, which 
URBANDOC received as a depository library, the principle of automatic inclusion was 
indeed followed insofar as project resources permitted. No further selection process
determined which reports were to be analyzed other than informal randomization. If, 
during the period that URBANDOC was actively adding to its documentation base, 
another documentation facility had been documenting the entire set of Urban Planning 
Assistance Program reports, then perhaps the choice would have been less random. 

Abbreviated Inclusion 

The depth (or the amount) of analysis that is accorded any particular document is often
a subtle way of dealing with the problem of selection. A minimum of document analysis
establishes the item as part of the document base. However, the fewer the descriptors and 
the fewer the added entries (see General Manual), the less likelihood that a marginal item
will appear in subsequent products of the bibliographic system. 

The document base is the same in item count whether the analysis has been extensive or 
brief. The latter treatment makes it possible to include documents whose existence 
should be recorded as part of a bibliographic control function, even though their retrieval
for most subject-oriented purposes appears to be of negligible value. This kind of
selection decision is, of course, most appropriate in a field that lacks other unified means 
of registering its publications. 

"Retrieval-only" Inclusion 

URBANDOC frequently received documents which were not appropriate for listing in the 
Input Index because they taxed the scope statement, most frequently in terms of
currency. This did not mean the document should not be represented in the over-all
system, particularly if it appeared to offer a genuine contribution to the retrieval 
potential of the file. The solution was a routing system. Those bibliographic records 
which were considered appropriate for input to the Input Index were so noted on the 
input form. The others were noted "for retrieval only" and handled in a separate
processing cycle that by-passed the publications module. 

One unfortunate consequence was that under these circumstances the Input Index would 
not serve as an index to the complete file, only to a portion. This was not sufficient for 
the project staff, which needed hard-copy access to the entire file. Therefore, each time a
magnetic tape reel was closed, the publications programs were used to index the entire
reel, not just the records used for the Input Index. The same procedure, or a variation of
it, is adaptable to a fully operational information system.
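
The routing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual programs: the record fields (doc_number, title, for_index), the function names, and the sample records are all hypothetical, and a plain list stands in for a reel of magnetic tape.

```python
from dataclasses import dataclass


@dataclass
class Record:
    doc_number: str   # hypothetical identifier format
    title: str
    for_index: bool   # True: routed to the Input Index; False: "for retrieval only"


def route(records):
    """Split one batch of input records into the two processing cycles."""
    index_cycle = [r for r in records if r.for_index]
    retrieval_only = [r for r in records if not r.for_index]
    return index_cycle, retrieval_only


def index_closed_reel(reel):
    """When a reel is closed, index every record on it, whatever its routing."""
    return sorted(r.doc_number for r in reel)


# Invented sample data standing in for one reel.
reel = [
    Record("D-0001", "Renewal plan", True),
    Record("D-0002", "Older housing survey", False),
    Record("D-0003", "Zoning study", True),
]
index_cycle, retrieval_only = route(reel)
staff_index = index_closed_reel(reel)
```

In this sketch the published Input Index would list D-0001 and D-0003 only, while the staff index built at reel closing covers all three records.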

Physical Access to the Documents 

On-Site 

In some documentation facilities, a library existed before the developments in 
documentation, and therefore the retention of the documents is not an issue. This was
not the case with URBANDOC, the stated mission being to demonstrate bibliographic
services, rather than physical access to the documents. However, the staff felt from the 
outset that it would require long-term access to the materials apart from any outside uses, 
and therefore the creation of a library turned out to be a by-product of the system. 

Although small, and lacking some reference books that would ordinarily be found in a 
facility serving on-site readers, the document collection had all the attributes of a library 
in the sense of being organized, indexed, and accessible through a variety of bibliographic 
means. In addition, there were professional librarians on hand, since all the document 
analysts held master's degrees in library science. 

The news of such a facility could hardly be kept secret, and many people from both 
within and without the university asked for permission to consult the documents as well 
as the bibliographic tools. Permission was usually granted because the project staff found 
it helpful to have personal confrontations with potential users. The cost of providing 
services to occasional visitors was compensated for by the opportunity to obtain
feedback. 

Reproductions 

Not all the users want physical access to the original documents; many of them prefer 
reproductions for off-site use. Each time URBANDOC distributed an Input Index, it 
received requests for copies of the documents, and it was necessary to explain that this 
was not part of the project services. This confusion existed because a service that provides 
either full-size or microform reproductions is not only technically feasible, but within the 
scope of documentation services in other areas. 



Although URBANDOC did not experiment with reproducing part of its data base, there 
was no reason why these services could not be added.

PRODUCT DEVELOPMENT

Although the original project goals were stated in terms of products, the full concept of 
product development evolved later. Constant was the idea that a single data base should
be capable of supporting all the various bibliographic services appropriate to an
information system. The two main services, publications and retrieval, have already been
discussed in terms of their relationship to the document analysis. However, their 
development as marketable products involves intellectual and practical considerations of 
broad applicability. 

As the project progressed, it became evident that inherent in each of the products were 
several service potentials. Consumer needs, consumer tastes (not necessarily consistent 
with the needs), and production cost were among the considerations that would require 
careful attention in the course of preparation for the market place. The best possible 
Input Index from the point of view of documentation theory would not necessarily be 
the most appropriate one for the project to recommend for distribution to practitioners. 

Field Testing 

The advantage of demonstration status was that both the product and the production 
environment could be developed simultaneously. There was no commitment to provide 
any kind of continuing services, much less one with fixed formats. Succeeding issues of
the Index or special Retrieval Reports could display different possibilities to potential
users. The label "field test" on all such product examples issued was intended to dispel
any doubts in the mind of the recipient as to the status of the material on hand. 

Thesaurus 

The first hard-copy product to be reproduced in quantity and distributed to potential 
users for comment was the Thesaurus. It was issued in May 1967 and reissued in August. 
The first three hundred copies were reproduced with the help of the New York City 
Planning Commission, and the second three hundred with the help of the Graduate 
Center of the City University. The recipients were members of the planning profession 
who had been known to URBANDOC since the original pilot project, plus a great many 
librarians and others active in urban planning and renewal. 

As indicated in Chapter III, "Goals: Evolution and Resolution," the response resulted in
many useful changes to the working copy of the Thesaurus. It also resulted in the
Thesaurus acquiring something of a reputation as a guide for specialized libraries, and the
many subsequent requests for the Thesaurus had to be referred to the Clearinghouse for 
Federal Technical and Scientific Information, which made it available in both full-size 
hard copy and microfiche. 

Publications 

The next instance of field testing was similarly informal. It consisted of distributing early 
versions of the Input Index to a mailing list of approximately eight hundred individuals 
and asking for informal feedback. The indexes were numbered 1 (December 1967) and 2 
(January 1968) respectively. The first was based on 300 documents, and the second on 
340 documents. In both cases the specialized listings included arrangements by personal 
authors, corporate authors, titles, subjects, and consultants.

The distribution list included directors of local housing and renewal agencies and
municipal, county, and state planning agencies who had added the project to their own
mailing lists for document distribution (see Chapter IV), plus recipients of the
URBANDOC Thesaurus. The feedback on these issues, which had been produced very
inexpensively, revealed the importance of aesthetics in bibliographic products. 

The third field test in the spring of 1968 used more formal reproduction techniques and 
received wider distribution (four thousand copies). The American Institute of Planners 
made available its entire membership list for URBANDOC's use. A covering letter from 
the executive director of AIP introduced the test issues and encouraged response to a
questionnaire that had been prepared by URBANDOC in consultation with the institute. 

Several factors were being studied. One was a comparison of the two versions of the 
Index, one of which was a cumulation and rearrangement of the bibliographic citations in
issues 1 and 2; the other presented new citations as well as alternative arrangements as 
issue 3. 

About 10 percent of the recipients responded to the questionnaire - a small but 
acceptable figure. The nature of the responses indicated that URBANDOC would have to 
make further improvements in the Input Index before it could function effectively as an 
index journal for this audience. 

The improvements were implemented — as far as practical — for issue 4. That was the last 
of the field tests for the Input Index; the issue was produced and distributed in January
1969. Some copies went to respondents of previous field tests, in order to obtain a
continuing evaluation of both graphics and contents. Additional copies were sent to 
individuals who had learned of the earlier issues, particularly librarians and faculty 
members at colleges and universities. The questionnaire accompanying this field-test issue 
was partially comparative and partially new. 

The response from issue 4 resulted in the prototype Input Index contained in Appendix B 
of this volume. In the following sections of this chapter, additional details are given about 
the resolution of the problems of aesthetics, reproduction techniques, and user response 
to the Input Index.

Retrieval 

The last of the three major products to receive field-test treatment was the Retrieval 
Report. As previously indicated, machine retrievals had been produced in the
course of development. Search statements were suggested by the staff and members of 
the National Advisory Council to test various responses of the computer programming. 
These were all "retrospective" searches in that the search was made of the entire file,
however large it was at the moment. The term "retrieval" ordinarily refers to this kind of
use of the system, although the retrieval capability can also be applied just to the current
input, in which case the product is called some form of selective dissemination of
information, or SDI.

In the SDI variation of retrieval, each user has submitted a "profile," expressed in
descriptor terminology, which constitutes a standing order or query. At regular intervals, 
the group of profiles is processed against the new input, the purpose of the product being
to alert the user to incoming materials in his field(s) of interest. The individual profiles
may be either general or specific, and may be adjusted from time to time. It is also
feasible, and in fact economical, to have group profiles, each one serving several users
with similar interests.
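
The SDI cycle described above can be sketched as follows. This is a minimal illustration, not the project's search programs: it assumes that a profile is simply a set of descriptors and that any one shared descriptor counts as a match, whereas the actual search logic may have been more elaborate. The profile names, descriptors, and document numbers are invented.

```python
def run_sdi(profiles, new_records):
    """Return, for each profile, the document numbers of matching new records."""
    alerts = {}
    for name, wanted in profiles.items():
        alerts[name] = [
            doc_number
            for doc_number, descriptors in new_records.items()
            if wanted & descriptors   # assumption: one shared descriptor is a match
        ]
    return alerts


# Standing profiles, expressed in (hypothetical) descriptor terminology.
profiles = {
    "planner_a": {"HOUSING", "RELOCATION"},
    "transit_group": {"MASS TRANSIT"},   # a group profile serving several users
}

# Only the current input is searched, not the entire file.
new_records = {
    "D-0101": {"HOUSING", "CODE ENFORCEMENT"},
    "D-0102": {"MASS TRANSIT", "FINANCE"},
    "D-0103": {"OPEN SPACE"},
}

alerts = run_sdi(profiles, new_records)
```

Running the same function against the whole file instead of the current input would correspond to a retrospective search.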

In view of the growing interest in SDI, URBANDOC decided to make its public field test
of retrieval take that form rather than retrospective searching. The opportunity was
provided by the 1969 annual meeting of the American Society of Planning Officials, 
which had invited the project to organize a formal session by way of a report to that 
segment of the urban profession. After examining a preliminary copy of the conference 
program, the staff formulated eighteen search statements, each one reflecting a particular
aspect of the program, which were processed against a small file of current input. 

A questionnaire enclosed with these Retrieval Reports solicited comments by ASPO 
members and visitors attending the session. The response was good, but it was not 
conclusive; the audience appeared to be more at ease with computer-produced 
information than might be expected of a more general audience. 

The Issues 

Aesthetics 

Issues 1 and 2 of the Index were printed directly onto continuous-form multilith mats, 
which were used to reproduce the issues. While this was an inexpensive and convenient 
way of reproducing printouts, few users seemed willing to trade less formal appearances 
for the economies in production. Later issues were produced by reducing and
photo-offsetting the print-out reports. Although many recipients reported on the difficulty of
reading computer-produced indexes with substantial reductions in print size, the 18
percent reduction in Index No. 4 was found to be satisfactory. 

Many of the adverse comments about the physical appearance of the earliest issues of the 
Index were solved by revisions of the computer programs. These included greater use of 
indentation to distinguish parts of the main document listing, making the location of page 
numbers for sections of the Index uniform, and the addition of running heads on each 
page. Use of colored paper in issue 4 to distinguish the major sections of the Index 
resulted in substantial user approval. Further refinements in the display of the record are 
made in the prototype. The cost of increasing the aesthetic appeal by these changes was 
substantial, but they represented one of the issues in determining the commercial viability 
of the product. 

The URBANDOC solution to the aesthetics problems of the final hard copy is quite
different in basic approach from that of documentation centers that produce indexes that
do not even look like printouts. In the URBANDOC publications module, the output is
the printout itself. In more sophisticated (and expensive) systems, the output is magnetic
tape, which goes through further computer processing before it emerges as
traditional-looking type, complete with right-hand justification, upper- and lower-case letters, and
different fonts. Although the URBANDOC product may seem simple by comparison, it is
economically sounder for the projects most likely to be undertaken in any local or 
regional implementation.

In the case of retrieval, aesthetic considerations appear to be less important, but not
negligible. Prior to the last quarter of 1969, URBANDOC used the standard report format
that came with the Combined File Search System. Although this format was somewhat difficult
to read, the programming resources of the project could not support the extensive reprogramming
that would have been required in order to produce a more readable product. The
users did not complain; neither did they rush requests for more searches. To approach
a wider audience seemed unwise until the situation could be remedied. This was done
when URBANDOC found, and arranged to use, a set of programs called "Search
Expansion" from the Engineering Index. They produce the format illustrated in the
appended Retrieval Report. It is not only readable, but also easily handled by standard
office copying equipment. 

Presentation 

There are difficulties in presenting bibliographic information in ways that are both 
economical and understandable to nonbibliographers. All the feedback indicates that 
readers prefer to have indexes arranged chiefly by subject and to have each citation as full 
as possible. Unfortunately, this becomes economically unfeasible since each book or 
article is cited many times in the typical index, appearing under several different subjects 
as well as in author and other specialized listings. (For a fuller discussion of the
bibliographic records, see the General Manual.)

Publication costs dictated a simpler solution, preferably one in which each bibliographic 
record would be presented in full only once. All other citations would use a number to 
refer the reader to the full citation. URBANDOC studied the field-test results to try to
achieve a compromise between strict economics and complete user satisfaction. The Main 
Subject Listing in Appendix B reflects the compromise. 

Input Index (as Appended) 

The citations in this first section of the prototype Index are arranged by subject
heading, the latter in alphabetical sequence. The citations themselves contain only the
primary author, the title, and the document number. If the reader is interested in such 
further bibliographic information as imprint, publisher, abstract, or other added entries, 
he must turn to the second section of the Index, the Main Document Listing. The 
arrangement is in document number sequence. (The document number itself is explained 
in detail in the General Manual.)

The third section of the Input Index contains the various specialized indexes: personal
author, corporate author, title, project number, place, consultant. Those listings are even 
more economical than the Main Subject one, containing only the access point in 
alphabetical sequence and the document number. The specialized indexes are therefore 
completely dependent upon the Main Document Listing. This necessity to go from one 
listing to another for complete bibliographic information is called the "double look-up." 
It is a feature common to many indexes, for the same reason of economy. 
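
The three-part arrangement and the "double look-up" can be illustrated with a short sketch. It assumes a simplified record layout; the field names, document numbers, and sample entries are invented for the example and do not reproduce the URBANDOC record format.

```python
# Invented sample records keyed by (hypothetical) document number.
documents = {
    "D-0002": {"author": "Jones, A.", "title": "Relocation study",
               "publisher": "Hypothetical Housing Agency",
               "subjects": ["RELOCATION", "HOUSING"]},
    "D-0001": {"author": "Smith, J.", "title": "Renewal plan",
               "publisher": "Hypothetical Planning Dept.",
               "subjects": ["HOUSING"]},
}


def main_document_listing(documents):
    """Second section: full records in document number sequence."""
    return dict(sorted(documents.items()))


def main_subject_listing(documents):
    """First section: subject headings in alphabetical sequence, each citation
    reduced to primary author, title, and document number."""
    listing = {}
    for number, rec in sorted(documents.items()):
        for subject in rec["subjects"]:
            listing.setdefault(subject, []).append(
                (rec["author"], rec["title"], number))
    return dict(sorted(listing.items()))


def personal_author_index(documents):
    """Third section (one specialized index): access point and number only."""
    return sorted((rec["author"], number) for number, rec in documents.items())


# The double look-up: the author index yields only a document number ...
first_look = dict(personal_author_index(documents))["Jones, A."]
# ... which must then be found in the Main Document Listing.
second_look = main_document_listing(documents)[first_look]
```

Each full record is stored once, in the Main Document Listing; every other listing repeats only a few short fields, which is where the economy comes from.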

Retrieval Report (as Appended) 

In the Retrieval Report, each citation is listed once, in its entirety. There is no "double
look-up" outside of the necessity to check the full name of periodical titles. Even these
are quickly learned, especially the ones that are frequently retrieved. The only other
"translation" problem in the citation concerns the geographic descriptors, since they 
appear as numeric codes rather than natural language place names. A few miscellaneous 
bits of geographic information (Standard Metropolitan Statistical Area and city size) 
would require consultation with external tables, but the main place names reappear later 
in the citation in natural language. In practice, the presence of coded information for the 
geographic part of the document analysis does not interfere with the readability of the
Retrieval Report.

A little more disconcerting is the fact that each citation starts with the descriptors and 
proceeds to the author, title, and other usual elements of bibliographic information.
Although the reverse order seems more natural, there are advantages to a display format 
which shows the reasons for retrieving a particular document before the document itself 
is described bibliographically. In any case, URBANDOC had no option but to accept the 
present format for demonstration purposes. 

If the retrieval capability were being used operationally for selective dissemination of
information, the pressure to reformat the report would be greater. Many other systems 
produce their SDI reports on cards, one citation per card. The cards are designed for easy 
mailing as well as easy filing by the recipients. The necessary programs could easily 
interface with the Combined File Search, thus making the results of SDI searches in the 
URBANDOC system a different-looking product. 

Another possibility for using the retrieval capability consists of specialized bibliographies.
URBANDOC developed its own retrieval-publications interface which enables the 
computer to search the data base retrospectively, produce output on a reel of magnetic 
tape, and then rearrange the output into the formats of the Input Index. URBANDOC
assigned a tentative name to this product — Subject Series — but did not produce 
field-test issues. 

Optional Elements of Information 

Whereas the issues on presentation are the results of decisions at the time of output, the 
issues in this section also become involved in the input. If all the needs of all potential 
users were accommodated, each record would contain funding information, abstracts, 
project numbers, ordering directions, and many other desiderata. The field test helped to 
indicate consumer reactions to these possible features. 

The prototype Input Index, in addition to incorporating many minor improvements over 
the previous versions, added one completely new feature. In response to requests during 
the field test, the ordering information is included in the Main Document Listing. On the 
other hand, the prototype retains one feature that did not gain wide consumer 
acceptance: the project number index, restricted at the moment to projects under the 
Urban Planning Assistance Program. URBANDOC feels that this feature should not be 
discontinued without further exploration. Further testing can be done, since it is the least 
expensive of the optional features. 

The most costly of the options is the abstract, which in many other documentation
services is basic to the entire index. Although abstracting was not a primary URBANDOC
responsibility, the project took advantage of the opportunity to include this kind of
analysis when an abstract was provided with the Urban Planning Assistance Program
reports. Since the field test indicated consumer acceptance, the project felt it worthwhile
to experiment with the agency-provided abstracts. The results, much more compact than
the originals, are included for many of the references in the Main Document Listing in
Appendix B. 

Mix of the Document Base 

Composition of the document mix in a bibliographic system is also an issue. URBANDOC 
had envisioned the document base as being composed primarily of reports and periodical 
articles of a professional level. During the development of the computer system, some of 
the input to the document base was designated for retrieval only, since the items were
believed to be of limited interest from the standpoint of a particular product, e.g., 
newsletters of a house organ nature. 

Within the base considered for publication use, the document mix was maintained at 
roughly 5 percent monographs, 70 percent documents, and 25 percent periodical articles. 
The consequences of this policy were twofold. First, many recipients of the Index, as well 
as other people, were unaware that the listing in the Index issue was less than the total 
input. Second, the scope of the mix was frequently questioned by potential users, 
depending upon their special interests. 

The project received suggestions to include several other types of materials, such as 
articles from popular and semipopular journals. Therefore, the first large field-test 
questionnaire contained questions eliciting possible interest for each type. The response 
was 52.7 percent for including materials from popular journals like Newsweek, and 65.3
percent for such semipopular ones as Fortune. This was a surprising result, since the bulk
of the field test had been with members of the American Institute of Planners. Similarly
unexpected interest was shown in the possibility of a book-review index: 56.7 percent 
responded favorably. 

The potential effect of extending coverage to popular and semipopular journals on a 
continuing URBANDOC-type system is manifold, especially for the input economics. 
Such articles are easy to acquire and analyze. A substantial increase in that part of the 
data base would dramatically reduce over-all unit costs. However, these materials are 
already covered by many commercial indexes, and the essence of URBANDOC was to 
demonstrate automated bibliographic controls for the more difficult materials. Obviously 
many trade-offs will bear consideration before there can be a final decision as to the 
composition and mix of the data base. 



The Respondents 

Majority Practitioners 

A good measure of user acceptability and interest in the Input Index as a product was 
supplied by the response to two questionnaires for the last two field-test issues. Of the 10
percent who responded, approximately 70 percent were practitioners, both planners
working with public agencies and consultants. The remainder (30 percent) were
persons closely associated with planning and urban development, either as members of
academic institutions, planning librarians, or as documentalists and others interested in
library automation. During analysis of the questionnaires, the practitioners' responses
were further broken down by functional interest, e.g., those concerned with
management and administration as against those primarily concerned with research and
analysis, including information systems.

Influence of Functional Responsibilities 

Significant variations in the needs and interest levels of the six categories of respondents 
emerged as the questionnaires were analyzed. Upon comparison, a pattern was discernible 
on almost every item covered in the two questionnaires. Planning agency management 
respondents were most interested in the Input Index as an information resource, but did 
not particularly see themselves using it as extensively as other members of their staff. 
Using the percentages as an indication of interest, the planning agency management group 
replies showed a consistently lower level of interest in many aspects of the Index 
mentioned in the questionnaires than did the other categories. 

Conversely, the four other major planning information users — academia, planning agency 
research analysts, consultants, and librarians — expressed interest in the Input Index 
issues as a working tool. Their answers to questions regarding the scope and format of the 
indexes, the bibliographic policy and even methodology reflected interest in the potential 
usefulness of the Index as a reference and current awareness tool. The pattern of 
responses by planning librarians also indicated the constraints they have on their 
collections and their acquisition policies for types of materials. 

While the questionnaire return summary tabulations show the rising level of satisfaction 
with the Input Index by all categories of respondents, some of the differences can best be 
pinpointed from detailed breakdowns. 

Resource Versus Reference User Attitudes 

While 69.7 percent of the total respondents to questionnaire 1 (evaluating the cumulated 
index issue and Index No. 3) were satisfied with the preface and introductory materials, 
both planning agency management and planning consultants found the materials less 
satisfactory (66 percent for each category). The lower satisfaction of the management 
category continued at the same level (66 percent) in evaluating the introduction of Index 
No. 4. The consultant category, however, found improvement (90 percent satisfied). Both
categories were less satisfied with the table of contents presentation in Input Index No. 4
than the total respondents (75.8 percent, 80 percent, and 81.3 percent respectively).

In the first questionnaire, respondents were asked about their interest in several 
additional proposed indexes. The geographic index was one of the indexes added for 
Index No. 4. In the second questionnaire, respondents were asked if they found the 
geographic index included in Index No. 4 useful. A comparison of responses by categories 
to the two questions not only indicated that the five major categories found the index 
more useful than anticipated but reflects a typical measurement of usefulness for 
individual categories:

Respondent Category        Questionnaire #1    Questionnaire #2
                               % of yes            % of yes

Academic                         68.8                92.0
Planners - Research              68.4                81.0
Consultants                      67.7                95.0
Planners - Management            63.9                72.5
Librarians                       70.4                77.2
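
The comparison in the table can be restated as percentage-point changes. The sketch below uses only the figures given in the table; the variable names are illustrative. Note that the two questions differ (anticipated interest in questionnaire 1, reported usefulness in questionnaire 2), so the differences indicate direction rather than a strict measurement.

```python
# (questionnaire 1 % yes, questionnaire 2 % yes), as given in the table.
geographic_index = {
    "Academic": (68.8, 92.0),
    "Planners - Research": (68.4, 81.0),
    "Consultants": (67.7, 95.0),
    "Planners - Management": (63.9, 72.5),
    "Librarians": (70.4, 77.2),
}

# Percentage-point change for each respondent category.
changes = {
    category: round(after - before, 1)
    for category, (before, after) in geographic_index.items()
}

# Categories ordered from largest to smallest increase.
ranked = sorted(changes, key=changes.get, reverse=True)
```

Every category shows an increase; the largest are for consultants and academic respondents, and the smallest for librarians and planning agency management.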



Survey Summary 

Accurate profiles of the various user group needs certainly cannot be derived from the 
responses to such informal market research tools as the two questionnaires, but broad 
outlines and various areas of overlapping interests do become apparent. For the 
statistically oriented, the total return for each of the questionnaires was approximately 
10 percent of the mailing; only usable returns were tabulated (less than 1 percent were 
not usable because of missing source identification or incompleteness). All quantification 
is expressed in percentages of the total usable returns. Both questionnaires used either
"yes" and "no" or "satisfactory" and "unsatisfactory." Space for comments was
provided. The analysis does not include comments; "satisfactory" was treated as a "yes" 
answer and "unsatisfactory" as a "no" answer. 
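
The tabulation rule just described can be sketched as follows. This is an illustration only: the function name and the sample answers are invented, and a missing answer is represented here by None.

```python
from collections import Counter


def tabulate(answers):
    """Percentages of total usable returns for one question.

    Each entry in `answers` is "yes", "no", "satisfactory",
    "unsatisfactory", or None when the question was left unanswered.
    """
    merged = {"satisfactory": "yes", "unsatisfactory": "no"}
    counts = Counter(merged.get(answer, answer) for answer in answers)
    total = len(answers)
    return {
        "yes": round(100 * counts["yes"] / total, 1),
        "no": round(100 * counts["no"] / total, 1),
        "no answer": round(100 * counts[None] / total, 1),
    }


# Ten invented usable returns for a single question.
sample = ["yes"] * 6 + ["satisfactory"] * 2 + ["unsatisfactory"] + [None]
result = tabulate(sample)
```

Comments written in the space provided play no part in the percentages, in keeping with the analysis described above.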

Summary of User Response to Questionnaire Sent with Input Index 3
and Cumulative Edition of Index 1 and 2

                                                       Total Respondent Replies
                                                            by Percentages

     Question                                            Yes      No   No Answer

     Re: Input Index as a Whole
 1.  Format - satisfactory?                             78.5    16.1       5.4
 2.  Contents - satisfactory?                           78.9    14.2       6.9
 3.  Sequence of Indexes - satisfactory?                74.7    17.4       7.9
 4.  Prefaces and Introduction - satisfactory?          69.7    22.1       8.2

     Re: Document Listing
 5.  Is this listing comprehensible?                    83.3    12.3       4.4
 6.  Do you prefer associated author information
     here?                                              48.6    42.3       9.1
 7.  Do you want federal project numbers
     in this main bibliographic entry?                  52.7    34.4      12.9

     Re: Special Indexes
 8.  Personal author - satisfactory?                    85.2    11.0       3.8
 9.  Corporate names - satisfactory?                    79.8    12.9       7.3
10.  Subject - satisfactory?                            82.6    10.7       6.7
11.  Significant title - satisfactory?                  81.7    12.6       5.7
12.  Consultants - satisfactory?                        78.2    12.3       9.5

     Re: Subject Index
13.  Are the subject headings adequate?                 81.0    11.7       7.3
14.  If you are familiar with the URBANDOC
     Thesaurus, would you use it as a supplemental
     tool for finding cross references?                 49.5    14.3      35.7

     Re: Proposed Additional Indexes
15.  Geographic Index - interested in?                  66.2    24.6       9.2
16.  Book Review Index - interested in?                 56.8    27.5      12.7

     Re: Frequency of Issue
17.  Every six weeks - satisfactory?                    64.0    18.0      18.0
18.  Cumulated bi-annually - satisfactory?              74.1    18.3       7.6

     Re: Input Index 3 Only
19.  Do you like the title and subject combination?     68.8    24.6       6.6
20.  Do you like the corporate author and
     consultant combination?                            67.8    22.1      10.1
21.  Do you find it possible to cope with the
     truncated titles and corporate names?              72.9    20.5       6.6
     [The combination indexes gave citations
     by document numbers only.]

     Re: Layout and Print Size
22.  Print size for Issue No. 3 - satisfactory?         70.3    23.7       6.0
23.  Print size and format for Cumulative Issue?        79.4    13.4       7.2
24.  Are these types of formats easy to use?            73.5    17.0       9.5
     [Index 3 was bound across the top.]

     Re: Ordering Information
25.  Can you obtain most documents analyzed
     within your agency or from others in your area?    54.6    29.6      15.8

     Re: Scope
26.  Would you like the Input Index to include
     pertinent articles from popular journals,
     e.g., Newsweek?                                    52.7    39.4       7.9
27.  Semipopular journals, e.g., Fortune?               65.3    24.9       9.8
28.  More foreign documents?                            46.7    43.8       9.5
29.  Greater attention to graphic materials?            48.0    37.5      14.5

     Re: Browsing Procedures
30.  Do you browse by author first?                     32.5    63.1       4.4
       (a) % of "Yes" cited corporate author           (58.2)
       (b) % of "Yes" cited personal author            (24.2)             17.5
31.  Of all indexes, which do you use first?
       (a) % cited subject index                        63.1
       (b) % cited significant title                     9.1              27.5
32.  Of all indexes, which do you use second?
       (a) cited an author index                        18.3
       (b) cited subject index                           5.4              43.9
       (c) cited significant title                      20.2
33.  If we add an abbreviated Document Listing,
     sorted by broad subject headings, what
     headings would you use - give 5-8 examples?        37.5              62.5

     [Because this was a fill-in question,
     tabulation was made of those responding
     as follows:]
       (a) subject examples (87.4)
       (b) political jurisdiction examples (12.6)

Summary of User Response to Questionnaire Sent with Input Index 4

                                                       Total Respondent Replies
                                                            by Percentages

     Question                                            Yes      No   No Answer

     Re: General Performance
 1.  Is the Input Index a reasonably satisfactory
     prototype of a computer-generated indexing tool?   91.8     4.6       4.6
 2.  Do the listings in Input Index 4 represent the
     kinds of indexes and materials you would like
     to see and can use?                                87.2     8.2       4.6

     Re: Format and Arrangement
 3.  Introduction - satisfactory?                       80.2    13.0       6.8
 4.  Table of Contents - satisfactory?                  81.3    13.0       5.7
 5.  Sequence of sections and subsets - satisfactory?   86.1     8.2       5.7
 6.  Page headings and color paper - satisfactory?      87.2     8.2       4.6

     Re: Major Subject Index
 7.  Can you tolerate the abbreviated bibliographic
     form (author, title) in this particular listing
     as an economic compromise between full
     information and mere reference to a document
     number as in previous issues?                      86.1     8.2       5.7
 8.  Do the subject headings (left column) in this
     listing reflect present and/or anticipated
     information needs adequately?                      73^     15.1      11.6
 9.  Would you be interested in participating in
     retrieval tests using the more complete subject
     analysis which is in the computer system?          65.2    25.5       9.3

     Re: Document Listing
10.  Can you tolerate the omission of associated
     authors and project names in this listing
     [this was suggested by respondents to previous
     questionnaires]?                                   81.4    12.7       6.9
11.  Do you like the inclusion of abstracts with the
     documents that are associated with the Urban
     Planning Assistance Reports (701)?                 72.1    13.9      15.0
12.  Is the arrangement of document records into
     these subsets with page titles useful?             74.0    11.0      15.0

     Re: Special Indexes
13.  Urban Planning Assistance Program Projects -
     useful?                                            72.0    11.0      17.0
14.  Geographic (place names in document) - useful?     79.1     8.2      12.7
15.  Statutory Citations - useful?                      67.2    11.0      20.8


16. 


Re: Future Possible Services 
Would you be interested in having a subscription 
service offer the option of copies of the documents 
themselves on suitable microform at nominal additional 
costs? [This would apply only to materials that are not 
subject to copyright.) 


65.1 


25.5 


9.4 
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Summary of User Response to Questionnaire 
Sent with Input Index 4 

(continued) Total Respondent Replies 

by Percentages 

No 

Question No Answer 

17. Would you be interested in having a subscription 

service include news about the operation of the 
URBANDOC information system and suggestions for local 
implementation? 65.0 24.0 11.0 

18. Would you be interested in a subscription service that 
included listings of actual computer systems and 

programs in urban information that may be available? 69.4 22.0 48.6 

19. A number of U.S. radio and television stations produce 
programs relating to city planning and urban development 
activities in the areas they serve. Would you be interested 
in having bibliographic information on such programs 

included in the Input Index? 59.3 28.4 12.3 
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DATA PROCESSING 

Although the use of computers was at all times a central element in Project URBANDOC, 
the focus was on the application of existing techniques rather than on the development of 
new ones. Chapter III indicates the relationship between the computer programs and the 
attainment of the original project goals. 

A great deal, of course, remains to be said about the machine processes by which the 
analysis of the document base became a functioning data base. Some of the discussion 
relates to technical topics and some to administrative issues. The former has, with the 
advice and consent of HUD, been largely reserved for the General Manual and the
Operations Manual. However, a general description of the programming system is
appropriate for several reasons. First, few "generalists" can escape a data-processing
environment so engulfing that machine model numbers are entering the vernacular; and 
second, the introductory remarks about the system are essential background to a 
consideration of the administrative issues that are properly part of this report. 



Introduction 

The Programs 

The total system is a composite of programs from several sources. The core of the system 
is the Combined File Search System from the IBM 1401 Program Library. The majority 
of the remaining programs and subsystems were completely developed and implemented 
by the URBANDOC staff. There is also the set of programs obtained from the
Engineering Index. The basic programming languages are AUTOCODER and COBOL.

For systems development purposes there are five modules, or groupings of programs, that 
perform the various processing operations. The Thesaurus Module maintains the 
Thesaurus, both in machine-readable and hard-copy form. The Pre-edit Module formats
the input data and edits and lists it for proofreading and correction by the document
analysts. The Search Module performs the computer search and prints the Retrieval
Report. The Publications Module maintains the files for the index journal and prints it as
the Input Index. The Pre-edit Module and the Publications Module were wholly
developed by URBANDOC. The other modules consist of programs from the various 
sources. When considering the system for operations purposes, there are also processing 
cycles, or a different regrouping of the programs. 



The Equipment 

After the analysis of a document was completed, the worksheets were used to create 
machine-readable input. For the most part, URBANDOC used a keypunch on the 
project's premises to create punched card files. The card files were then transported to
the Computer Center of Baruch College for the actual machine processing. 

The system used in the URBANDOC demonstration was implemented on the IBM 1401 
computer. URBANDOC used the full capacity of the Baruch College installation, which
has twelve thousand positions of memory, card reader, card punch, printer, four tape
drives and two disk drives. (Actually, this system could be run on an IBM 1401 with eight 
thousand positions of memory, card reader, card punch, printer, and four tape drives.) 

The project's own systems staff operated the computer. The files were maintained on 
magnetic tapes and the programs on punched cards. Both the data files and the program 
files were stored and maintained at the Computer Center. The printed results of 
processing — both preliminary listings and the final products — were transported back to 
the project premises. 

In the absence of equipment for setting type, the printouts for the Input Index were
photo-reduced for final reproduction by offset reproducing equipment. Both Graduate
Center and commercial facilities were used for these post-computer processes.

Problem Areas 

A study of URBANDOC could evaluate each activity as a success or a failure. The project 
feels it would be meaningful, however, to consider problem areas (instead of failures)
since these are bound to recur in future documentation efforts in the urban areas.

If the problem areas are identified with the activities that give rise to them, then they can 
be divided into four broad categories: pre-document analysis, document analysis, data 
processing, and product development. The first category has already been discussed in
Chapter IV, "The Document Base," and the second in Chapter III, "Goals." The third
category, data processing, comprises the present chapter. Other problem areas emerged in
Chapter V, "Product Development." 

When discussing difficulties in data processing, a distinction must be made
between systems considerations and administrative considerations. As pointed out earlier, 
the emphasis here is administrative. The following topics, while not common to all 
documentation efforts, are characteristic of those using data processing. 

Data Entry 

The first problem area is data entry, or the creation of the information in a form directly 
usable by the computer. Computers cannot yet, either economically or technically, 
directly accept through optical scanning techniques the text of large volumes of written 
or printed matter. The bibliographic records created by the document analysts must 
therefore be converted into some kind of machine-readable form, either by keypunching 
or by some other device. 

Keypunching is the most widely used method. However, the logistics involved in
transporting large volumes of cards to and from off-site computer facilities created
something of a problem. In an attempt to alleviate it, URBANDOC investigated devices
that would enable the document analysis to be encoded directly onto magnetic tape,
eliminating the creation of large card files. DATATEXT, the system that appeared to be
the solution, was a version of IBM's Administrative Terminal System, available on a
commercial basis. URBANDOC tested DATATEXT for almost a year.

The equipment consisted of a typewriter-like terminal. It was installed at the project's
office and was connected by telephone to an IBM 1460 computer dedicated to certain
record creation and correction functions. The keying operations were performed by a
typist who had received several days of specialized training on the terminal. The typist
operated a keyboard and sent the bibliographic input records over a telephone wire to the
IBM 1460 computer. The input was then written onto a magnetic tape which would be
used to update the URBANDOC file. These tapes, machine-readable input for future
processing, could be produced by the commercial service on short notice and delivered to
the project offices for redelivery to the Baruch College computer facility. They would be
substitutes for punched-card input to the URBANDOC system. Or so it seemed.

Unfortunately, many problems arose from the intervention of the additional off-site
facilities. For both administrative and economic reasons, the project returned to
keypunching for all data entry, at least for second-generation computers (the 1400
series). Future documentation efforts will undoubtedly reopen the entire issue. Whether
using computers in the same building or elsewhere, they are most likely to resort to some
kind of terminal, possibly one not in existence at the time URBANDOC was investigating
alternative data entry devices. Even now, the project feels that procedures such as those
discussed in Chapter VIII, "Conclusions and Caveats," might have yielded greater success.

Data entry problems are not restricted to those involved in making the proper choice of 
mechanical devices. It is most important that no part of the entire operation be 
unnecessarily repetitious, which can be the case with data entry. Such a situation arises 
when the document analysts complete a worksheet and pass it on to the keypunch 
operator or terminal operator. Although the initial creation of the worksheet is not data 
entry as defined by data-processing terminology, it is in fact initiating the input. If all or 
part of documentation analysis can be done originally in machine-readable form (called 
source data entry), then the economies would seem obvious. During the DATATEXT trial,
the typist was also trained for simple descriptive analysis, the kind performed on 
periodical articles. The amount of repetition was therefore limited to the content analysis, 
which was performed first by the professionals. This part of the experiment was quite 
successful and could well have been extended to more complicated materials. 

Although theoretically the same approach to data entry could be implemented on the 
keypunch, the nature of the URBANDOC materials militates against it. The descriptive 
analysis of the major documents is so difficult that the original entry must create hard
copy that can be immediately reviewed. With DATATEXT and with some forms of 
terminals, the operator creates such hard copy when she types. With the keypunch and 
other terminals, this is not necessarily the case. There may be no readable record of the 
entries for some time. Revising and correcting can then be so difficult as to override the 
advantages of the single entry of data. 



Editing and Validating 

Data entry is only the first step in transferring a batch of bibliographic records into a 
useful data base. After that there is a great deal of computer processing, some of which 
will disclose entries that require revision. Some of these represent keypunching errors — 
misspellings, omissions, etc. — while others are more in the nature of editorial decisions. 
Developing an effective method of handling these revisions can be a problem area. 






At the start it seemed natural to check for errors at every point along the way and to 
make the changes as soon as the error became apparent. As a result, there were too many 
points at which proofreading, corrections, and data re-entry occurred — all expensive and 
somewhat chaotic. Correcting this situation required combining many human and 
machine capabilities. The details of the solution are presented in the manuals under 
"Editing and Validations."

The principles of the solution are of more general interest. All computer operations 
should be performed prior to error detection and correction. That way, all of the errors 
can be corrected at the same time, both those discovered by human editing and those 
found by computer validation procedures. One massive correction cycle can handle the 
entire problem. 
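
The single-cycle principle can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's AUTOCODER or COBOL procedure: the record layout, the empty-field validation rule, and the names `Correction`, `computer_validation`, and `correction_cycle` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    record_id: int
    field: str
    new_value: str
    source: str  # "computer" (validation run) or "human" (proofreading)


def computer_validation(records, required_fields):
    """Return (record_id, field) pairs for every empty required field."""
    return [
        (rid, name)
        for rid, rec in sorted(records.items())
        for name in required_fields
        if not rec.get(name, "").strip()
    ]


def correction_cycle(records, corrections):
    """Apply every queued correction in one pass; return a corrected copy."""
    corrected = {rid: dict(rec) for rid, rec in records.items()}
    for c in corrections:
        corrected[c.record_id][c.field] = c.new_value
    return corrected


# Hypothetical batch of two bibliographic records.
batch = {
    1: {"author": "SMITH J", "title": "URBAN RENEWLA"},
    2: {"author": "", "title": "HOUSING CODES"},
}

# All computer processing runs first; nothing is corrected yet.
flagged = computer_validation(batch, ["author", "title"])

# Errors from both sources are queued, then corrected together.
queue = [
    Correction(2, "author", "JONES A", "computer"),
    Correction(1, "title", "URBAN RENEWAL", "human"),
]
clean = correction_cycle(batch, queue)
```

The point of the design is that the batch is touched once: the machine-detected omission and the proofreader's spelling fix travel in the same queue instead of triggering separate re-entry runs.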

A second basic principle is that access to the document master file must be carefully
regulated. Unilateral decisions by individual staff members to revise a record already on the file
can have unforeseen consequences. This will be even more important when there are 
several terminals on line to the computer. 


Systems Support 

Adequate systems support, fundamental to every computer-aided operation, frequently 
becomes a problem area. What seems to be adequate support in the beginning turns out to 
be less so, experience turning up the necessity for expansion and modification of the 
system. For URBANDOC, additional support was inevitable. The original project design
envisioned the acquisition of an already developed computer system requiring little in the 
way of further development or modification. Preliminary investigations, made at the time 
the demonstration proposal was being developed, had indicated that the Combined File 
Search (CFS) met these specifications. 

The basic system was contributed to the IBM Program Library by the Service Bureau 
Corporation. It was not an IBM-supported set of programs. Experience revealed that 
considerable additional systems design and programming would be necessary, a 
requirement taxing to a project whose initial systems resources were limited to one 
person. Various devices, such as a cooperative arrangement with other users of the same 
system, overcame this deficiency. However, these same arrangements are not
recommended as a guide for other documentation efforts in the field.

The second-generation systems experience of URBANDOC does not, however, indicate
that conversion to the third generation need be an all-out expensive operation in terms of 
systems development. Three general possibilities are open to the user. Firstly, the use of a 
programming language oriented toward machine independence makes the conversion 
process a considerably simpler affair. For example, some of the URBANDOC programs 
are written in COBOL. Secondly, the existence of machine emulators and compatibility 
devices (such as those existing for the IBM 360) would allow the operation of the existing
second-generation system on third-generation equipment. Thirdly, the idea of using a
programming system from a manufacturer's programming library is still valid with the 
proper systems support. Some manufacturers provide support for a system that they have 
developed. Other manufacturers distribute contributed systems (those developed by
others). In some cases, technical support must be provided by the individual installation.
In other cases, technical support may be leased either from the manufacturer or from a 
consulting firm. 

Programming Languages 

In order to understand the problem arising from a choice of programming language, it is 
helpful to review the nature of programming languages themselves. 

A programming language is a system designed to translate the instructions that a 
programmer writes into a machine language that a computer interprets and executes. 
Among programming languages, there are various levels of translators. Those very close to 
machine language are called assemblers. Generally, one programmer's statement becomes 
one machine instruction. Higher-level languages, compilers, convert one programmer 
statement into multiple machine instructions or perhaps an entire sequence of operations.

When selecting a programming language, the trade-offs between assemblers and compilers 
must be considered. In general, it is easier and faster to train a programmer in a compiler 
than in an assembler. Compilers are comparatively machine-independent; that is, a 
program written in a compiler can be transferred to another computer with little 
modification to the programmer's coding. However, compilers are usually less efficient in 
terms of the core storage required to store the program and in the processing time 
required to run the program. 

The programs in the Combined File Search System were written in AUTOCODER, an 
assembler language oriented specifically toward the IBM 1401 computer. When 
URBANDOC enlarged the basic system to include publications and other capabilities, it 
seemed logical to continue writing in the same programming language. This policy did 
have drawbacks. First, since the language is specifically oriented toward a particular piece
of equipment, the resulting programs are not easily transferable to other computers, let
alone other manufacturers'. This became more serious when the government adopted the
position that its computer work should be manufacturer-independent. Second, programmers
for second-generation equipment were oriented toward writing in assembler-type
languages. Programmers who entered the field after the advent of the IBM 360 and
other third-generation equipment are not extensively trained to write in these languages.
URBANDOC discovered this while enlarging the systems staff and had a choice of
in-service training in AUTOCODER or shifting to another programming language.



Both these problems were resolved by programming in a compiler called COBOL. This 
move also had the advantage of facilitating any conversion from second- to
third-generation computing. Once the programs were written in 1401 COBOL, they could, with
necessary precautions, be easily converted for another system. However, there were 
disadvantages in operating in a COBOL environment. The time required to execute a job 
with a program in COBOL will be approximately double the time required by a 
comparable AUTOCODER program. Although this condition will still hold for the 360, it 
is anticipated that the faster speeds of the third-generation computer will make the 
greater execution time less onerous.
Equipment Independence



During the first year and a half the project used a computing facility outside the 
university. When Baruch College acquired the necessary equipment, URBANDOC
transferred its computer operations to that installation. The system was designed for the
most common 1401 configuration: four tape drives, eight thousand positions of memory,
card reader, card punch, and printer. It fitted well within the configuration of the Baruch 
College installation, which contained additional features later utilized to expand the 
system. Had it been necessary to effect another transfer, the only problem might have 
been to cut back on some of the expansions. 

The same situation is not expected for newer equipment. Third-generation computers 
require operating systems that control the operation of individual working computer 
programs for each individual installation. The typical manufacturer-provided
operating system now requires more computer memory (and all that goes with it) than
its more primitive second-generation predecessors. In addition there are more
alternatives in selecting the input-output devices for each installation. As a result, there 
will be fewer computer installations with the exact configuration of equipment to run 
one particular set of programs. 

The programs that perform such jobs as payroll, accounts receivable, or information
retrieval are called "processor programs" because they perform the actual manipulation
of the data. On second-generation equipment like the 1401, the user needed only the
processor programs. He could load his programs into the computer memory and run his
job. On third-generation systems like the 360, there must be an intervening level of
programs between the computer and the processor program. These programs, called
"operating systems," are control programs that monitor the actual running of the
processor programs. The operating system loads the programs into computer memory,
provides for steady transition from one job to the next, controls input and output
operations, records computer time used by each job, etc.
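
The division of labor between a control program and the processor programs it supervises can be pictured with a toy sketch in Python. It is a loose analogy only, not a description of any actual 360 operating system: the function `run_job_stream`, the two sample jobs, and their data are all invented for illustration.

```python
import time


def run_job_stream(jobs):
    """Toy monitor: start each processor program in turn, run it, log its time."""
    accounting = []
    for name, program, data in jobs:
        started = time.perf_counter()
        output = program(data)  # the processor program does the real work
        elapsed = time.perf_counter() - started
        accounting.append({"job": name, "output": output, "seconds": elapsed})
    return accounting


def payroll(hours_and_rates):
    """Hypothetical processor program: total pay for (hours, rate) pairs."""
    return sum(hours * rate for hours, rate in hours_and_rates)


def retrieval(query_and_titles):
    """Hypothetical processor program: titles containing the query word."""
    query, titles = query_and_titles
    return [title for title in titles if query in title]


job_log = run_job_stream([
    ("payroll", payroll, [(40, 3.0), (35, 4.0)]),
    ("retrieval", retrieval, ("HOUSING", ["HOUSING CODES", "TRANSIT PLANS"])),
])
```

The processor programs know nothing about sequencing or timekeeping; the monitor supplies both, which is why a program written for one operating environment cannot simply be carried to another.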

To compound the equipment independence issue, the individual installation will have a 
choice of several operating systems to use in its own facility. Besides locating the correct 
configuration, the user will have to find one with the correct operating systems 
environment. 

Of less significance but worth mentioning is the secondary storage used for data files. The 
second-generation system stored its files on magnetic tapes (the most easily transportable 
of all media). Third-generation equipment uses direct-access storage ("disk," "drum," or
"data-cell" devices), which are less amenable to such an arrangement.

The entire third-generation environment points to a situation where an URBANDOC-type
project would be dependent on one computer for its main processing. A smaller auxiliary 
configuration could be used for certain peripheral operations, such as card-to-tape and 
report-printing functions. The amount of access and storage as well as the testing 
conditions will all require careful consideration. 

Production Schedules 

The gap between experimentation and operations can be substantial, particularly if it 
involves data processing. URBANDOC's early experience indicated that complete success
as a demonstration had to include the attainment of a production-oriented environment.
It would not be sufficient to accomplish only the specified tasks or produce sample 
products. Some of the rigors of a production schedule had to be experienced for a report 
on the demonstration project to be useful to an ongoing operation. 

Field-testing several issues of the Input Index satisfied the requirement for production
experience. Several individual cycles of activities were geared toward producing an Index 
within a fixed time frame. Although the data bases were smaller than operational ones 
would be, the project did establish schedules for various activities and meet deadlines. 
The staff felt, by the time Input Index No. 4 had been distributed, that it could be 
published on a six-week cycle. This would include lead time for unexpected problems. 

These findings depended upon the correct determination of the necessary lead times for 
the various operations, somewhat longer because of off-site computing. Scheduling also 
had to include thesaurus updating, document master file updating, retrieval, and testing 
of new programs. More computer time or fewer activities not directly related to the Index 
would, of course, shorten the publication cycle.

The staff hoped that the production environment would expand to include machine 
searching on a regularly scheduled basis. The greatest challenge to the establishment of 
practical schedules appeared to be the introduction of search runs into the processing 
cycle, since several kinds of search services could be offered. 

If selective dissemination of information (SDI) was developed, the searches on user profiles
would be run midway between the first processing of a new batch of input and the
processing of revisions to that input. Such a schedule would guarantee a quick report.
Unfortunately, it would be a somewhat "dirty" report, being based on uncorrected data.
The idea was to have a product several weeks ahead of a regular Input Index issue. Errors
might be acceptable in the SDI reports if the user knew that corrections would be made 
before the data appeared in more permanent form. 

If retrospective searching services were also added, processing schedules would become
even more complicated. The processing of retrospective search requests, made against the
document master file of records, is separate from the processing of SDI user profiles,
since different data bases are used. The former generally involves the entire bibliographic
file while the SDI search involves only the newest input. Decisions would have to be made
whether retrospective searches on demand could be accepted from outside users at any
time or at some stated intervals.
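
The difference between the two kinds of search run can be sketched as follows. This is a minimal Python illustration under stated assumptions: the document numbers, index terms, and profile names are invented, and the matching rule (every query term must appear among a record's index terms) is a simplification, not the logic of the Combined File Search System.

```python
def search(records, terms):
    """Return ids of records indexed under every term in the query."""
    wanted = set(terms)
    return sorted(
        rid for rid, index_terms in records.items() if wanted <= set(index_terms)
    )


# Hypothetical document master file (the entire bibliographic file).
master_file = {
    101: ["HOUSING", "ZONING"],
    102: ["HOUSING", "RELOCATION"],
    103: ["TRANSPORTATION"],
}

# Hypothetical new batch, processed once but not yet corrected.
new_batch = {
    201: ["HOUSING", "RELOCATION"],
    202: ["ZONING"],
}

# Standing SDI profiles are run against the newest input only.
profiles = {
    "planner_a": ["HOUSING", "RELOCATION"],
    "planner_b": ["ZONING"],
}
sdi_reports = {user: search(new_batch, terms) for user, terms in profiles.items()}

# A retrospective request is run against the whole master file.
retrospective = search(master_file, ["HOUSING"])
```

Because the two runs read different files at different points in the processing cycle, each needs its own slot in the schedule.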
MANAGEMENT AND PRODUCTION
Reporting Requirements 

The original URBANDOC specification of a cost-analysis section to the final report did 
not distinguish between the costs of a development effort and those of an ongoing 
operation. This refinement was added as a result of the Advisory Council meeting of 
January 29, 1969. Along with it was added a request (formalized in a letter from HUD
to the project on February 21, 1969) for an "Operational Data Analysis Package." The
two were combined in one report, and submitted to HUD in March. This chapter 
summarizes the URBANDOC findings of spring, 1969. 

URBANDOC as an Operational Model 

In retrospect, it appears that the work program might well have provided for a 
postdevelopmental period in which a full staff would have prime responsibility for
simulating production. In lieu of that, it was necessary to isolate certain parts of the
URBANDOC experience, define the limitations, and consider them part of an operational
model. Such an isolation has been accomplished mainly in terms of a five-month period
from October 1, 1968, to February 28, 1969. Both HUD and the project staff considered
this period reasonably indicative of postdevelopmental conditions, more in terms of
document analysis than in terms of systems. (The preparations for converting from
second- to third-generation computing equipment constituted the chief disruption of the
systems stability.) 

From an activities point of view, URBANDOC was performing all the functions of a 
bibliographic information facility; it was acquiring documents, analyzing them,
constructing a machine-readable data base, and utilizing that data base to produce information
services. The levels of activity were proportionately lower at both ends of the spectrum — 
acquisitions and services — than in analysis and processing. Neither materials nor queries 
were being actively solicited during the five-month period. The assumption was that in an 
operational situation neither solicitation would be necessary. 

The document-analysis staff during the "model" period was represented by one 
experienced senior analyst, one analyst with some experience, and one new analyst added 
at the start of the model period. Total analysis capabilities were diminished when the 
project director spent the last six weeks of the period preparing a special report for HUD. 
As a result, she was largely unavailable for her usual participation in document analysis 
and review. 

Input and Output 

On the input side, URBANDOC as a model was deficient in terms of actual numbers of
items handled per year. During the five-month period, 1971 documents were considered
for inclusion in the system, and 1746 of them were actually analyzed. This becomes an annual
figure of roughly 4000 bibliographic records, fewer than would be expected of an operational
system. The project felt that it had enough information to project conditions for 8000 or 
even 10,000 items, but that larger projections might be unreliable. 
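
The annual figure can be checked with simple arithmetic. The short Python sketch below uses only the two counts given in the text; the variable names are illustrative. Straight-line scaling of the five-month count gives about 4190 records a year, which the report rounds down to 4000.

```python
# Counts reported for the five-month model period.
considered = 1971   # documents considered for inclusion
analyzed = 1746     # documents actually analyzed
months = 5

# Straight-line projection to twelve months.
annual_records = analyzed * 12 / months

# Share of considered documents that were analyzed.
acceptance_rate = analyzed / considered

print(f"projected annual records: {annual_records:.0f}")
print(f"share of considered documents analyzed: {acceptance_rate:.1%}")
```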
On the output side, the URBANDOC five-month numbers are less indicative. Input Index
No. 4 was the chief product of the period. The Retrieval Report was not ready until
January and the Subject Series not until after the expiration of the so-called model 
period. Fourteen machine searches were performed in January-February. The individual 
retrievals varied in length from two to 220 citations. The results appeared to be 
satisfactory to the URBANDOC staff as well as to the recipients. 

In general, output units are much more independent of staff activity than are input units. 
It takes little additional professional effort to run many more searches and to print more 
copies of an index. If a documentation facility must be concerned with increasing the 
number of output units, promotional activities appear to be in order. Which kinds of
output units should receive such promotional stimulus depends on the cost analysis. Further
considerations are postponed until that section of this report.

Operational Data Analysis 

In early 1969, URBANDOC devoted much of its attention to analyzing the operational 
data and the cost data produced by the five-month period of simulating production. As
indicated in the previous section, the actual input and output numbers of that period may
require some revision in order to be truly indicative of production. The methodology 
developed for analyzing them, however, should prove durable. 

Methodology 

URBANDOC developed, in consultation with HUD, an analytical approach that involved 
examining the entire operations in terms of individual, definable activities. In each case, 
this includes a process and a product. The product is not necessarily a consumer service, 
but a result from a process that, in turn, becomes the input to another process, until 
finally consumer services are realized. 

Each activity was assigned a unique schedule number as identification for discussions of 
operational and cost data. The schedules distinguish between production-oriented and 
developmental activities, thus providing the basis for a documentation model that can be 
adjusted for a particular mix of developmental and operational responsibilities. Each 
schedule can also be considered as representing one variable in cost analysis. 

URBANDOC was unique in that it was self-contained, performing all of its own 
processing (other than actual photo-reproduction). However, the cost analysis 
methodology anticipated a situation in which this might not be the case. Use of in-house
and external organization capabilities is designated in the worksheet descriptions of each
activity in the schedule.

The Process/Product Schedules list and a sample worksheet are presented in the following 
pages. 



List of Process/Product Schedules

Number            Process/Product

010               Acquisition
020               Selection
031               Indexing
032               Documentation development
040               Thesaurus and other authority functions
051               Data entry: documents
052               Data entry: Thesaurus
053               Data entry: re-entry
061               Thesaurus processing
062               Editing and validation (EDP): documents
070               Editing and validation: human
080               Document storage
100               Input processing
111               Input Index: process to camera-ready copy
112               Input Index No. 4 (product development)
120 (121-124)     Publications printing and dissemination
131               Retrieval: processing and file maintenance
132 and 140       Retrieval: product development
150               Documentation services (other than
                  publication and retrieval)
150 (Part II)     Feedback







Management and Production 



Sample Operational Data Worksheet 



Schedule No. 051 

Process/Products: Data entry: documents 

1. Process Description: Keyboarding from document input worksheets 

2. Physical Inputs to this process (e.g., hard copy documents, bibliographic records, 
   computer programs, etc.) 

   Provided by URBANDOC: 1746 worksheets (product of Schedule 031) 
   Provided by external organization(s): 0 

3. Human, non-computer processing services to this process (e.g., editing, validating, 
   typing, management, etc.) 

   Provided by URBANDOC: KP, 5/12 of 6.60 annual man-months 
   Provided by external organization(s): 0 

4. Data processing services to this process (e.g., data entry, machine operations, 
   machine time, etc.) 

   Provided by URBANDOC: 55% of KP usage machine time 
   Provided by external organization(s): 0 

5. Product Description: Bibliographic information in machine-readable form, ready 
   for computer processing. 

6. Product Itemization: 1764 decklets representing about 28,000 units of 
   information 

7. Product Utilization: To Schedule 062, Editing and Verification (EDP): 
   Documents. 

8. Product Storage: Temporary storage until successful completion of all EDP. 



Manning Data — General 

All the figures in the operational data analysis were based on the experience of the model 
five-month period. The term totals are actual counts. The manning data are also based on 
actual five-month figures, expressed in 5/12 man-years. This was done to provide HUD with 
manning data that could be compared with those of other federal documentation efforts. 



Manning data legend: 

PD=Project Director; DA=Document Analysts; OS=Office Services Personnel; SP=Systems and Computer 
Production Personnel; KP=Keypunch Personnel. 
The amount of staff time spent on each individual activity was carefully calculated as part 
of the cost analysis. To the schedules for Process/Products were added five additional 
ones in a 300 series to cover General Administration, Advance Planning, Systems 
Development and Language Conversion, Computer Operations, and Office Services. The 
following schedule gives the percentages of staff time, and the actual man-years devoted 
to each of the activities. 

Manning Data by Function 

                                                Product/Process 
Activity                                        Schedule          % of Time   Man-Years 

Acquisition                                     010                   1.9        .18 
Selection                                       020                    .3        .03 
Indexing                                        030                 (24.6)     (2.36)* 
  Indexing                                      031                  22.5       2.16 
  Documentation development                     032                   2.1        .20 
Thesaurus and other authority functions         040                   3.4        .33 
Data entry: keyboarding                         050                  (7.1)      (.68)* 
  Documents                                     051                   5.7        .55 
  Thesaurus and other authority functions       052                    .8        .08 
  Re-entry                                      053                    .5        .05 
Editing and validation: EDP processing          060                  (2.7)      (.26)* 
  Thesaurus                                     061                    .5        .05 
  Documents                                     062                   2.2        .21 
Editing and validation: human                   070                   4.0        .38 
Document storage                                080                   3.1        .30 
Input processing                                100                    .5        .05 
Input Index                                     110                  (1.0)      (.10)* 
  Processing to camera-ready copy               111                    .5        .05 
  Development through camera-ready copy         112                    .5        .05 
Publications dissemination                      120                  (6.1)      (.58)* 
  General administration                        122                   4.2        .40 
  Data entry of mailing list                    123                   1.6        .15 
  Mailing list processing                       124                    .3        .03 
Retrieval Report                                130                  (1.0)      (.10)* 
  Processing and file maintenance               131                    .5        .05 
  Product development                           132                    .5        .05 
Document services (other) and feedback          150                   5.0        .48 
General administration                          300                   4.3        .41 
Advance planning                                310                   3.5        .34 
Systems development and language conversion     320                  18.3       1.76 
Computer operations                             330                  (6.5)      (.62)* 
  Management                                    331                   2.7        .26 
  Maintenance                                   332                   3.7        .36 
Office services                                 340                   6.7        .64 

Total                                                               100.0       9.60 

* Indicates summary figures by group. 
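
The two numeric columns of the manning table appear to be related by simple proportion: each man-year figure matches the activity's share of staff time applied to the 9.60 man-year total. The sketch below checks that relationship for a few schedules. The proportional rule and the half-up rounding are assumptions inferred from the printed figures, not a method stated in the report, and the function name is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

TOTAL_MAN_YEARS = Decimal("9.60")  # total from the manning table

# Percentage of staff time for a few schedules, copied from the table.
PERCENT_OF_TIME = {
    "010": Decimal("1.9"),   # Acquisition
    "031": Decimal("22.5"),  # Indexing
    "051": Decimal("5.7"),   # Data entry: documents
    "320": Decimal("18.3"),  # Systems development and language conversion
}


def man_years(percent):
    """Convert a share of total staff time into man-years, rounded to 0.01.

    Assumption: man-years = percent / 100 * total man-years.
    """
    value = percent / Decimal("100") * TOTAL_MAN_YEARS
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for schedule, percent in PERCENT_OF_TIME.items():
    print(schedule, man_years(percent))
```

Decimal arithmetic is used so that the two-place rounding behaves the same way as hand calculation.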
Cost Analysis 

General Considerations 






The five-month experience during the model period also furnished the basis for analyzing 
the costs involved in operating a documentation facility. The five-month figures were 
projected for a year. In a few instances, figures were modified to eliminate strictly 
demonstration-status elements from the financial picture. All expenses include the value 
of noncash local contributions. 

Since URBANDOC was a self-contained project (not attached to an urban renewal library 
or research effort), there was no option for using an incremental cost approach. If 
another documentation effort had such institutional arrangements, it might be possible to 
arrive at a more advantageous cost by considering only the incremental items in the 
budget. While many figures in the URBANDOC analysis would be appropriate for 
inclusion there, others would be absorbed in the overhead. 

Many other considerations also affect the cost: 

1. The amount of development work necessary; 

2. The mix of the document base, there being a wide divergence in cost between 
handling major governmental reports and periodical articles, fairly standard 
Urban Planning Assistance Reports, and unique materials; 

3. The amount and kind of analysis per document. 

The figures in this report represent intensive effort in all three areas. The developmental 
work would undoubtedly be less in an operational situation. Other variables might be 
lowered as a result of a trade-off between operating costs and product packaging. 



Annual Budget 

The total budget for the model period can be considered as 5/12 of $190,000. In the 
following discussion, all figures are annual, applicable to four thousand documents of 
input plus the output already discussed. Three major cost analyses were made: input and 
output costs; operational and developmental costs; and direct and indirect costs. 

Major attention was devoted to the salary and wage items, which totaled $95,000 for the 
year. This figure included a staff of ten. The full-time staff consisted of a project director, 
three document analysts, two programmer-systems analysts, and two secretaries. The 
senior systems analyst was two-thirds time for the year; the key punch operator was half 
time. No other salary costs were involved since the accounting services were part of 
overhead and the computer operations were all performed by the URBANDOC staff. 
Messenger services to and from the computer center were also included in the overhead. 

The three nonsalary parts of the budget were as follows: 
1. Equipment costs: data entry devices; computer services; office equipment. 
(All computer time was assigned a cash value.) 

2. Salary- and wage-related costs: fringe benefits; overhead (includes shelter and 
various administrative expenses). 

3. Miscellaneous costs: travel; Advisory Council; communications; printing (for 
field-test issues); supplies; contingencies. 

Personnel Cost Analysis: Methodology 

URBANDOC was most interested in determining the various personnel costs of the 
project, both in dollars and in time. The total personnel budget was first divided into five 
major sections: project direction, document analysis, systems and production, data entry, 
and office services. Each staff member was allocated to one of these sections. Each 
section was represented graphically by a universe whose entirety represented 100 percent 
of the total staff time for that section. Although it would have been possible to include 
data entry with office services, the division was made so that in the future the cost of 
data entry arrangements could be compared with keypunching. 

The percentages of staff time spent on various functions within the designated section 
were then estimated. The manning data schedules provided the categories for time 
analysis of staff activities, both section by section and in its entirety. Although some of 
the percentages may seem rather arbitrary, they were compared informally with estimates 
available from other documentation efforts and found to be consistent with outside 
experience. 

The five universes follow: 
[Figures: staff-time universes. Surviving panel titles: Project Direction; Systems and 
Production; Data Entry. The charts themselves are not reproduced here.] 

† This high percentage is due to a change in the data processing environment. In an 
operational situation with a stabilized data processing environment this figure can be 
expected to drop to approximately 30%. 

* This figure is somewhat high due to the necessity, under present budgetary and 
administrative constraints, of using off-site office services. 
Personnel Cost Analysis: Dollar Compilations 

Calculations were also made for the dollar cost of the activities within each universe. The 
various costs were then assembled in terms of total staff by activity. In addition, each 
activity cost was divided by 4000 to arrive at a cost per item of input. 



                                                Product/Process                 Unit Cost/ 
Activity                                        Schedule          Annual $      4K Docs. 

Acquisition                                     010             $ 1,548.50      $  .39 
Selection                                       020                 570.00         .14 
Indexing                                        030             (25,203.50)      (6.30)* 
  Indexing                                      031              22,638.50        5.66 
  Documentation development                     032               2,565.00         .64 
Thesaurus and other authority functions         040               3,800.00         .95 
Data entry: keyboarding                         050              (2,584.00)       (.65)* 
  Documents                                     051               2,090.00         .52 
  Thesaurus and other authority functions       052                 304.00         .08 
  Re-entry                                      053                 190.00         .05 
Editing and validation: EDP processing          060              (2,945.00)       (.74)* 
  Thesaurus                                     061                 589.00         .15 
  Documents                                     062               2,356.00         .59 
Editing and validation: human                   070               4,256.00        1.06 
Document storage                                080               1,567.50         .39 
Input processing                                100                 589.00         .15 
Input Index                                     110              (1,178.00)       (.30)* 
  Processing to camera-ready copy               111                 589.00         .15 
  Development through camera-ready copy         112                 589.00         .15 
Publications dissemination                      120              (2,954.50)       (.74)* 
  General administration                        122               2,090.00         .52 
  Data entry of mailing list                    123                 570.00         .14 
  Mailing list processing                       124                 294.50         .07 
Retrieval Report                                130              (1,178.00)       (.30)* 
  Processing and file maintenance               131                 589.00         .15 
  Product development                           132                 589.00         .15 
Document services (other) and feedback          150               6,042.00        1.51 
General administration                          300               6,906.50        1.73 
Advance planning                                310               4,968.50        1.24 
Systems development and language conversion     320              18,411.00        4.60† 
Computer operations                             330              (6,954.00)      (1.74)* 
  Management                                    331               2,945.00         .74 
  Maintenance                                   332               4,009.00        1.00 
Office services                                 340               3,344.00         .84 

Total                                                           $95,000.00      $23.77 

* Indicates summary figures by group. 

† This cost is due to a change in the data-processing environment, shift to third-generation equipment 
and conversion to COBOL. Under more normal conditions, this cost can be expected to drop 
proportionately. 
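
The unit-cost column can be reproduced by dividing each annual personnel cost by the 4000-document input base, as the text describes. The following is a minimal sketch for a few individual schedules; the half-up rounding to the nearest cent and the function name are assumptions for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

DOCUMENTS_PER_YEAR = 4000  # annual input base used throughout the analysis

# Annual personnel dollars for a few schedules, copied from the table.
ANNUAL_COST = {
    "010": Decimal("1548.50"),   # Acquisition
    "031": Decimal("22638.50"),  # Indexing
    "051": Decimal("2090.00"),   # Data entry: documents
    "320": Decimal("18411.00"),  # Systems development and language conversion
}


def unit_cost(annual_dollars, documents=DOCUMENTS_PER_YEAR):
    """Cost per input document, rounded to the nearest cent (assumed half-up)."""
    per_document = annual_dollars / documents
    return per_document.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for schedule, dollars in ANNUAL_COST.items():
    print(schedule, unit_cost(dollars))
```

The group rows in the table appear to be sums of the already rounded member rows rather than fresh divisions. For example, schedule 110 shows .30 (that is, .15 + .15), whereas $1,178.00 divided by 4000 would round to .29.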



Computer Costs 



The machine time was divided into four functions: production, systems development and 
language conversion, computer operations maintenance, and miscellaneous processing. 
Cost schedules were developed for each part of production, using the same number 
assigned to personnel, with the addition of the prefix C. The other functions were 
assigned schedules in the C 300 series, to be consistent with personnel overhead costs. 

Analysis of Computer Operations: Over-All 

                                                            % of Annual   Annual    Annual 
Activity                                        Schedule    Operations    Hours     Cost 

Systems and production                                         44.67       201     $10,050 
  Thesaurus                                     C061 
  Editing and validation: doc.                  C062 
  Input processing                              C100 
  Index to camera-ready copy                    C111 
  Index through camera-ready copy               C112 
  Mailing list                                  C124 
  Retrieval Report 
    Processing and file maintenance             C131 
    Retrieval Report product development        C132 
Systems development and language conversion     C320           37.56       169       8,450 
Computer operations maintenance                 C332           11.55        52       2,600 
Miscellaneous processing                                        6.22        28       1,400 

Total                                                         100.00       450     $22,500 



Computer costs are estimated on the basis of $50 per hour valuation of the following 
configuration at the Baruch College Computing Center: IBM 1401 12 K Computer, with 
four tapes, two disks, and operator console. It is rented equipment, and used on the first 
shift. 
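
Given the $50 per hour valuation, the annual cost column of the over-all analysis follows directly from the annual hours. A short sketch of that arithmetic, using the hours from the table (the dictionary keys and function name are illustrative only):

```python
HOURLY_RATE = 50  # dollars per machine hour, as stated in the text

# Annual machine hours by function, copied from the over-all analysis.
ANNUAL_HOURS = {
    "Systems and production": 201,
    "Systems development and language conversion (C320)": 169,
    "Computer operations maintenance (C332)": 52,
    "Miscellaneous processing": 28,
}


def annual_cost(hours, rate=HOURLY_RATE):
    """Cash value assigned to a block of machine time."""
    return hours * rate


costs = {name: annual_cost(hours) for name, hours in ANNUAL_HOURS.items()}
total_hours = sum(ANNUAL_HOURS.values())
total_cost = sum(costs.values())

for name, dollars in costs.items():
    print(f"{name}: ${dollars:,}")
print(f"Total: {total_hours} hours, ${total_cost:,}")
```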
Computer Costs Directly Related to Product/Process 

                                                 Hours/    Annual #    Annual     Unit 
Activity                             Schedule    Cycle     Cycles      Cost       Cost 

Thesaurus processing                 C061          3          6       $   900    $ .23 
Editing and validation: doc.         C062         10          6         3,000      .75 
Input processing                     C100          3          6           900      .23 
Input Index                          C110         (6)        (3)         (900)    (.23)* 
  To camera-ready copy               C111          4          3           600      .15 
  Through camera-ready copy          C112          2          3           300      .08 
Mailing list                         C124          6†         3           900      .23 
Retrieval Report                     C130        (11.5)      (6)       (3,450)    (.86)* 
  Processing and file maintenance    C131          3.5        6         1,050      .26 
  Product development                C132          2         24††       2,400      .60 

Total                                             33.5                $10,050    $2.53 

* Summary figures by group. 

† Indicates all processing, except for mailings, in AUTOCODER; mailings in COBOL, therefore 
relatively higher hours/cycle. 

†† Or four times per updated cycle. 
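
The annual cost of each production schedule in this table is consistent with hours per cycle multiplied by the number of cycles per year and by the $50 hourly valuation. The sketch below recomputes the individual (non-summary) rows; treating the rate as uniform across schedules is an assumption drawn from the stated valuation, and the variable names are illustrative.

```python
HOURLY_RATE = 50  # dollars per machine hour, as stated for the IBM 1401

# (schedule, hours per cycle, cycles per year) for the individual rows.
PRODUCTION_RUNS = [
    ("C061", 3, 6),    # Thesaurus processing
    ("C062", 10, 6),   # Editing and validation: documents
    ("C100", 3, 6),    # Input processing
    ("C111", 4, 3),    # Input Index: to camera-ready copy
    ("C112", 2, 3),    # Input Index: through camera-ready copy
    ("C124", 6, 3),    # Mailing list
    ("C131", 3.5, 6),  # Retrieval: processing and file maintenance
    ("C132", 2, 24),   # Retrieval: product development
]


def annual_cost(hours_per_cycle, cycles_per_year, rate=HOURLY_RATE):
    """Annual machine cost of one schedule."""
    return hours_per_cycle * cycles_per_year * rate


annual = {s: annual_cost(h, c) for s, h, c in PRODUCTION_RUNS}
hours_per_cycle_total = sum(h for _, h, _ in PRODUCTION_RUNS)
annual_total = sum(annual.values())

print(hours_per_cycle_total, annual_total)
```

The recomputed total agrees with the $10,050 shown for systems and production in the over-all analysis.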



Cost Summaries 

The costs are summarized in terms of three kinds of activities: preparation of the data 
base (input), products and product development (output), and overhead. The costs 
ascribable to these activities are summarized in terms of personnel, equipment, and 
others. If desirable, the input and output figures can be considered direct costs, and the 
overhead indirect. 

The summaries are based on the figures in the previous schedules, but reflect certain 
additional considerations. The costs for systems development and language conversion 
were broken down into 20 percent for direct input activities, 40 percent for direct output 
activities, and 40 percent for overhead. (The project could not foresee a time in the 
predictable future when it would be possible to operate a documentation facility strictly 
on the basis of already developed computer programs and systems.) It also appeared 
desirable to be able to consider how changes in input and output volumes would affect 
the costs. 

The greatest effect on the summary figures is in the personnel column. The system cost 
reallocation raises the unit costs from $10.62 to $11.54 for input and from $2.98 to 
$4.82 for output (both with a 4K document base). Overhead is reduced from $10.15 to 
$7.41 per unit. These changes do not affect the total personnel cost of $23.77 per unit. 
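
The reallocation described above can be traced with the unit figures from the personnel table. The sketch below works in whole cents and spreads the $4.60 unit cost of schedule 320 across input and output at 20 and 40 percent. How the report arrived at the $7.41 overhead figure is not stated; taking overhead as the remainder of the fixed $23.77 total is an assumption that happens to reproduce it, while direct subtraction from $10.15 gives $7.39.

```python
# Unit personnel costs in cents per document, before reallocation (from the text).
INPUT_BEFORE = 1062
OUTPUT_BEFORE = 298
OVERHEAD_BEFORE = 1015
SYSTEMS_DEVELOPMENT = 460  # schedule 320 unit cost, $4.60
TOTAL_PERSONNEL = 2377     # $23.77, unchanged by the reallocation


def reallocate(input_share=20, output_share=40):
    """Move part of the systems development cost from overhead to input and output.

    Shares are percentages; the remainder stays in overhead.
    """
    to_input = SYSTEMS_DEVELOPMENT * input_share // 100
    to_output = SYSTEMS_DEVELOPMENT * output_share // 100
    new_input = INPUT_BEFORE + to_input
    new_output = OUTPUT_BEFORE + to_output
    # Assumption: overhead is whatever remains of the fixed personnel total.
    new_overhead = TOTAL_PERSONNEL - new_input - new_output
    return new_input, new_output, new_overhead


new_input, new_output, new_overhead = reallocate()
# Direct subtraction, for comparison with the residual figure above.
overhead_direct = OVERHEAD_BEFORE - SYSTEMS_DEVELOPMENT * 60 // 100

print(new_input / 100, new_output / 100, new_overhead / 100, overhead_direct / 100)
```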



Summary of Activity and Unit Costs Based on 4000 Documents 

                                     Personnel   Equipment   Other      Total 
Activity                             Costs       Costs       Costs      Costs 

Preparation of data base (Input)     $11.54      $1.56*                 $13.10 
Products (Output)                      4.82       2.36       $ 1.25†      8.43 
Overhead and other                     7.41       1.91        16.68††    26.00 

Totals                               $23.77      $5.83       $17.93     $47.53§ 

* $1.33 for computer services and .18 for data entry. 

† Printing costs for product development. 

†† The following costs were involved: 

    Fringe benefits                          $ 3.56 (really part of personnel) 
    Administrative costs including rent        6.90 
    Travel                                      .30 
    Advisory Council                           2.12 
    Communications                              .50 
    Supplies                                   2.00 
    Miscellaneous                              1.30 

§ The total annual budget of $190,000 divided by 4000 documents amounts to $47.50 per unit. The 
difference between this and $47.53 is due to rounding in various calculations and analyses. 

Many of the unit figures are unnaturally high, but unavoidable with a 4000-document base. 
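
The summary table can be cross-checked by adding its rows and comparing the grand total with the budget figure of $190,000 spread over 4000 documents. The sketch below does so with the unit costs as printed; treating the blank "Other" cell for input as zero is an assumption, and the names are illustrative.

```python
from decimal import Decimal

# (personnel, equipment, other) unit costs in dollars, copied from the summary table.
# Assumption: the blank "Other" cell for input is treated as zero.
ROWS = {
    "input": (Decimal("11.54"), Decimal("1.56"), Decimal("0")),
    "output": (Decimal("4.82"), Decimal("2.36"), Decimal("1.25")),
    "overhead": (Decimal("7.41"), Decimal("1.91"), Decimal("16.68")),
}

row_totals = {name: sum(cells) for name, cells in ROWS.items()}
grand_total = sum(row_totals.values())

# Unit cost implied directly by the annual budget.
budget_per_document = Decimal(190000) / Decimal(4000)
rounding_gap = grand_total - budget_per_document

print(row_totals, grand_total, budget_per_document, rounding_gap)
```

The three-cent gap is the rounding difference that the table's final footnote describes.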

Projections 

Productivity 

URBANDOC began its projections for an operational service with the assumption that 
there would be a minimum turnover from the present staff and that normal improvement 
factors would increase its productivity. Additional positions would be created as essential 
and in a way to minimize the time necessary for recruiting and training. 



Staff Function             Present Staff          Projected Staff 

Project director           1 director             1 director + 1/2 administrative assistant 
Document analysis          3 analysts             4 analysts + 1 subprofessional 
Systems and production     2 2/3                  2 2/3 (same) 
Data entry                 1/2 operator           1 operator + 1 subprofessional 
Office services            2 office assistants    2 office assistants + administrative assistant 
Contingencies                                     Part-time as necessary 

Total                      10 people              14 people 
                           (2 part-time)          (1 part-time) 